SCHIZOCHYTRIUM PKS GENES



INTRODUCTION 

Field of the Invention 

This invention relates to modulating levels of enzymes and/or enzyme components 
capable of modifying long chain poly-unsaturated fatty acids (PUFAs) in a host cell, and 
constructs and methods for producing PUFAs in a host cell. The invention is exemplified by 
production of eicosapentenoic acid (EPA) using genes derived from Shewanella putrefaciens
and Vibrio marinus.

Background 

Two main families of poly-unsaturated fatty acids (PUFAs) are the ω3 fatty acids,
exemplified by eicosapentenoic acid, and the ω6 fatty acids, exemplified by arachidonic acid.
PUFAs are important components of the plasma membrane of the cell, where they can be found 
in such forms as phospholipids, and also can be found in triglycerides. PUFAs also serve as 
precursors to other molecules of importance in human beings and animals, including the 
prostacyclins, leukotrienes and prostaglandins. Long chain PUFAs of importance include 
docosahexenoic acid (DHA) and eicosapentenoic acid (EPA), which are found primarily in 
different types of fish oil, gamma-linolenic acid (GLA), which is found in the seeds of a number
of plants, including evening primrose (Oenothera biennis), borage (Borago officinalis) and black
currants (Ribes nigrum), stearidonic acid (SDA), which is found in marine oils and plant seeds,
and arachidonic acid (ARA), which along with GLA is found in filamentous fungi. ARA can be 
purified from animal tissues including liver and adrenal gland. Several genera of marine bacteria 
are known which synthesize either EPA or DHA. DHA is present in human milk along with 
ARA. 

PUFAs are necessary for proper development, particularly in the developing infant brain, 
and for tissue formation and repair. As an example, DHA is an important constituent of many
human cell membranes, in particular nervous cells (gray matter), muscle cells, and spermatozoa,
and is believed to affect the development of brain functions in general and to be essential for the
development of eyesight. EPA and DHA have a number of nutritional and pharmacological
uses. As an example, adults affected by diabetes (especially non-insulin-dependent diabetes) show
deficiencies and imbalances in their levels of DHA which are believed to contribute to later
coronary conditions. Therefore a diet balanced in DHA may be beneficial to diabetics.

For DHA, a number of sources exist for commercial production including a variety of 
marine organisms, oils obtained from cold water marine fish, and egg yolk fractions. The 
purification of DHA from fish sources is relatively expensive due to technical difficulties,
making DHA expensive and in short supply. In algae such as Amphidinium and Schizochytrium
and marine fungi such as Thraustochytrium, DHA may represent up to 48% of the fatty acid
content of the cell. A few bacteria also are reported to produce DHA. These are generally deep
sea bacteria such as Vibrio marinus. For ARA, microorganisms including the genera
Mortierella, Entomophthora, Pythium and Porphyridium can be used for commercial
production. Commercial sources of SDA include the genera Trichodesma and Echium.
Commercial sources of GLA include evening primrose, black currants and borage. However,
there are several disadvantages associated with commercial production of PUFAs from natural
sources. Natural sources of PUFA, such as animals and plants, tend to have highly
heterogeneous oil compositions. The oils obtained from these sources can require extensive
purification to separate out one or more desired PUFA or to produce an oil which is enriched in 
one or more desired PUFA. 

Natural sources also are subject to uncontrollable fluctuations in availability. Fish stocks 
may undergo natural variation or may be depleted by overfishing. Animal oils, and particularly 
fish oils, can accumulate environmental pollutants. Weather and disease can cause fluctuation in
yields from both fish and plant sources. Cropland available for production of alternate
oil-producing crops is subject to competition from the steady expansion of human populations and
the associated increased need for food production on the remaining arable land. Crops which do
produce PUFAs, such as borage, have not been adapted to commercial growth and may not
perform well in monoculture. Growth of such crops is thus not economically competitive where
more profitable and better established crops can be grown. Large-scale fermentation of
organisms such as Shewanella also is expensive. Natural animal tissues contain low amounts of 
ARA and are difficult to process. Microorganisms such as Porphyridium and Shewanella are 
difficult to cultivate on a commercial scale. 

Dietary supplements and pharmaceutical formulations containing PUFAs can retain the
disadvantages of the PUFA source. Supplements such as fish oil capsules can contain low levels
of the particular desired component and thus require large dosages. High dosages result in
ingestion of high levels of undesired components, including contaminants. Care must be taken
in providing fatty acid supplements, as overaddition may result in suppression of endogenous
biosynthetic pathways and lead to competition with other necessary fatty acids in various lipid
fractions in vivo, leading to undesirable results. For example, Eskimos having a diet high in ω3
fatty acids have an increased tendency to bleed (U.S. Pat. No. 4,874,603). Fish oils have
unpleasant tastes and odors, which may be impossible to economically separate from the desired
product, such as a food supplement. Unpleasant tastes and odors of the supplements can make
such regimens involving the supplement undesirable and may inhibit compliance by the patient.

A number of enzymes have been identified as being involved in PUFA
biosynthesis. Linoleic acid (LA, 18:2 Δ9, 12) is produced from oleic acid (18:1 Δ9) by a
Δ12-desaturase. GLA (18:3 Δ6, 9, 12) is produced from linoleic acid (LA, 18:2 Δ9, 12) by a
Δ6-desaturase. ARA (20:4 Δ5, 8, 11, 14) is produced from DGLA (20:3 Δ8, 11, 14), catalyzed by
a Δ5-desaturase. Eicosapentenoic acid (EPA) is a 20 carbon, omega 3 fatty acid containing 5
double bonds (Δ5, 8, 11, 14, 17), all in the cis configuration. EPA, and the related DHA (Δ4, 7,
10, 13, 16, 19, C22:6) are produced from oleic acid by a series of elongation and desaturation
reactions. Additionally, an elongase (or elongases) is required to extend the 18 carbon PUFAs
out to 20 and 22 carbon chain lengths. However, animals cannot convert oleic acid (18:1 Δ9)
into linoleic acid (18:2 Δ9, 12). Likewise, α-linolenic acid (ALA, 18:3 Δ9, 12, 15) cannot be
synthesized by mammals. Other eukaryotes, including fungi and plants, have enzymes which
desaturate at positions Δ12 and Δ15. The major polyunsaturated fatty acids of animals therefore
are either derived from diet and/or from desaturation and elongation of linoleic acid (18:2 Δ9,
12) or α-linolenic acid (18:3 Δ9, 12, 15).
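
The Δ shorthand used in the preceding paragraph can be made concrete with a small sketch. The Python below is an illustration only and is not part of the disclosed method: the FattyAcid class and the desaturate and elongate helpers are hypothetical names introduced here. It assumes that a desaturase adds one double bond at the named Δ position and that an elongase adds two carbons at the carboxyl end, so that every Δ position shifts by two. The elongation of GLA to DGLA is a standard pathway step that the text implies but does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FattyAcid:
    """A fatty acid described by chain length and delta double-bond positions."""
    carbons: int
    double_bonds: tuple  # positions counted from the carboxyl end

    def shorthand(self) -> str:
        positions = ", ".join(str(p) for p in self.double_bonds)
        return f"{self.carbons}:{len(self.double_bonds)} Δ{positions}"

    def omega(self) -> int:
        """Distance of the last double bond from the methyl end (3 for ω3, 6 for ω6)."""
        return self.carbons - max(self.double_bonds)


def desaturate(fa: FattyAcid, position: int) -> FattyAcid:
    """Model a desaturase: introduce one double bond at the given delta position."""
    if not 1 <= position < fa.carbons or position in fa.double_bonds:
        raise ValueError(f"cannot desaturate at Δ{position}")
    return FattyAcid(fa.carbons, tuple(sorted(fa.double_bonds + (position,))))


def elongate(fa: FattyAcid) -> FattyAcid:
    """Model an elongase: add two carbons at the carboxyl end, shifting positions by 2."""
    return FattyAcid(fa.carbons + 2, tuple(p + 2 for p in fa.double_bonds))


oleic = FattyAcid(18, (9,))
la = desaturate(oleic, 12)   # Δ12-desaturase: oleic acid to linoleic acid
gla = desaturate(la, 6)      # Δ6-desaturase: LA to GLA
dgla = elongate(gla)         # elongase: GLA to DGLA (assumed standard step)
ara = desaturate(dgla, 5)    # Δ5-desaturase: DGLA to ARA

for name, fa in [("LA", la), ("GLA", gla), ("DGLA", dgla), ("ARA", ara)]:
    print(name, fa.shorthand(), f"ω{fa.omega()}")
```

Under these assumptions the sketch reproduces the structures named in the text: four enzyme steps lead from oleic acid to ARA, and each must be supplied separately when a desaturase and elongase pathway is engineered.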

Polyunsaturated fatty acids are considered to be useful for nutritional, pharmaceutical,
industrial, and other purposes. The supply of polyunsaturated fatty acids from natural
sources and from chemical synthesis is not sufficient for commercial needs. Because a number
of separate desaturase and elongase enzymes are required for fatty acid synthesis from linoleic
acid (LA, 18:2 Δ9, 12), common in most plant species, to the more unsaturated and longer chain
PUFAs, engineering plant host cells for the expression of EPA and DHA may require expression
of five or six separate enzyme activities to achieve expression, at least for EPA and DHA, and
for production of quantities of such PUFAs additional engineering efforts may be required, for
instance the down regulation of enzymes competing for substrate, engineering of higher enzyme
activities such as by mutagenesis or targeting of enzymes to plastid organelles. Therefore it is of
interest to obtain genetic material involved in PUFA biosynthesis from species that naturally
produce these fatty acids and to express the isolated material alone or in combination in a
heterologous system which can be manipulated to allow production of commercial quantities of
PUFAs.






Relevant Literature 

Several genera of marine bacteria have been identified which synthesize either EPA or 
DHA (DeLong and Yayanos, Applied and Environmental Microbiology (1986) 51: 730-737).
Researchers of the Sagami Chemical Research Institute have reported EPA production in E. coli
which have been transformed with a gene cluster from the marine bacterium, Shewanella
putrefaciens. A minimum of 5 open reading frames (ORFs) are required for fatty acid synthesis
of EPA in E. coli. To date, extensive characterization of the functions of the proteins encoded
by these genes has not been reported (Yazawa (1996) Lipids 31, S-297; WO 93/23545; WO 
96/21735). 

The protein sequence of open reading frame (ORF) 3 as published by Yazawa, USPN
5,683,898 is not a functional protein. Yazawa defines the protein as initiating at the methionine
codon at nucleotides 9016-9014 of the Shewanella PKS-like cluster (Genbank accession
U73935) and ending at the stop codon at nucleotides 8185-8183 of the Shewanella PKS-like
cluster. However, when this ORF is expressed under control of a heterologous promoter in an E.
coli strain containing the entire PKS-like cluster except ORF 3, the recombinant cells do not
produce EPA.

Polyketides are secondary metabolites the synthesis of which involves a set of enzymatic 
reactions analogous to those of fatty acid synthesis (see reviews: Hopwood and Sherman, Annu.
Rev. Genet. (1990) 24: 37-66, and Katz and Donadio, in Annual Review of Microbiology (1993)
47: 875-912). It has been proposed to use polyketide synthases to produce novel antibiotics
(Hutchinson and Fujii, Annual Review of Microbiology (1995) 49:201-238). 

SUMMARY OF THE INVENTION 

Novel compositions and methods are provided for preparation of long chain
poly-unsaturated fatty acids (PUFAs) using polyketide-like synthesis (PKS-like) genes in plants and
plant cells. In contrast to the known and proposed methods for production of PUFAs by means
of fatty acid synthesis genes, the invention provides constructs and methods for
producing PUFAs by utilizing genes of a PKS-like system. The methods involve growing a host
cell of interest transformed with an expression cassette functional in the host cell, the expression
cassette comprising a transcriptional and translational initiation regulatory region, joined in
reading frame 5' to a DNA sequence for a gene or component of a PKS-like system capable of
modulating the production of PUFAs (PKS-like gene). An alteration in the PUFA profile of host
cells is achieved by expression following introduction of a complete PKS-like system
responsible for PUFA biosynthesis into host cells. The invention finds use for example in the
large scale production of DHA and EPA and for modification of the fatty acid profile of host
cells and edible plant tissues and/or plant parts.

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 provides designations for the ORFs of the EPA gene cluster of Shewanella.
Figure 1A shows the organization of the genes; those ORFs essential for EPA production in E.
coli are numbered. Figure 1B shows the designations given to subclones.

Figure 2 provides the Shewanella PKS-like domain structure, motifs and 'Blast' matches
of ORF 6 (Figure 2A), ORF 7 (Figure 2B), ORF 8 (Figure 2C), ORF 9 (Figure 2D) and ORF 3
(Figure 2E). Figure 2F shows the structure of the region of the Anabaena chromosome that is
related to domains present in Shewanella EPA ORFs.

Figure 3 shows results for pantetheinylation - ORF 3 in E. coli strain SJ16. The image
shows [C14] β-alanine labelled proteins from E. coli (strain SJ16) cells transformed with the
listed plasmids. Lane 1 represents pUC19, lane 2 represents pPA-NEB (Δ ORF 3), lane 3
represents pAA-Neb (EPA+), lane 4 represents ORF 6 subclone, lane 5 represents ORF 6 + ORF
3 subclones, and lane 6 represents ORF 3 subclone. ACP and an unknown (but previously
observed) 35 kD protein were labelled in all of the samples. The high molecular mass proteins
detected in lanes 2 and 5 are full-length (largest band) and truncated products of the Shewanella
ORF-6 gene (confirmed by Western analysis). E. coli strain SJ16 is conditionally blocked in
β-alanine synthesis.

Figure 4A shows the DNA sequence (SEQ ID NO:1) for the PKS-like cluster found in
Shewanella, containing ORFs 3-9. Figure 4B shows the amino acid sequence (SEQ ID NO:2)
of ORF 2, which is coded by nucleotides 6121-8103 of the sequence shown in Figure 4A. Figure
4C shows the amino acid sequence (SEQ ID NO:3) of the published, inactive ORF 3, translated
from the strand complementary to that shown in Figure 4A, nucleotides 9016-8186. Figure 4D
shows the nucleotide sequence 8186-9157 (SEQ ID NO:4); its complementary strand codes for
ORF 3 active in EPA synthesis. Figures 4E-J show the amino acid sequences (SEQ ID NOS:5-10)
corresponding to ORFs 4-9, which are encoded by nucleotides 9681-12590 (SEQ ID
NO:81), 13040-13903 (SEQ ID NO:82), 13906-22173 (SEQ ID NO:83), 22203-24515 (SEQ ID
NO:84), 24518-30529 (SEQ ID NO:85) and 30730-32358 (SEQ ID NO:86), respectively, of
Figure 4A. Figure 4K shows the amino acid sequence (SEQ ID NO:11) corresponding to
nucleotides 32834-34327.

Figure 5 shows the sequence (SEQ ID NO: 12) for the PKS-like cluster in an 
approximately 40 kb DNA fragment of Vibrio marinus, containing ORFs 6, 7, 8 and 9. The start 
and last codons for each ORF are as follows: ORF 6: 17394, 25352; ORF 7: 25509, 28160; ORF 
8: 28209, 34265; ORF 9: 34454, 36118. 

Figure 6 shows the sequence (SEQ ID NO:13) for an approximately 19 kb portion of the
PKS-like cluster of Figure 5 which contains the ORFs 6, 7, 8 and 9. The start and last codons
for each ORF are as follows: ORF 6: 411, 8369 (SEQ ID NO:77); ORF 7: 8526, 11177 (SEQ
ID NO:78); ORF 8: 11226, 17282 (SEQ ID NO:79); ORF 9: 17471, 19135 (SEQ ID NO:80).
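
The coordinates listed for Figures 5 and 6 can be cross-checked with simple arithmetic. The sketch below is illustrative only; it assumes that each pair gives the first nucleotide of the start codon and the last nucleotide of the last codon, and the names FIG5, FIG6, orf_length and coordinate_offsets are introduced here for the example.

```python
# ORF coordinates (first nucleotide, last nucleotide) as listed for Figures 5 and 6.
FIG5 = {
    "ORF 6": (17394, 25352),
    "ORF 7": (25509, 28160),
    "ORF 8": (28209, 34265),
    "ORF 9": (34454, 36118),
}
FIG6 = {
    "ORF 6": (411, 8369),
    "ORF 7": (8526, 11177),
    "ORF 8": (11226, 17282),
    "ORF 9": (17471, 19135),
}


def orf_length(start: int, end: int) -> int:
    """Length in nucleotides of an ORF given inclusive coordinates."""
    return end - start + 1


def coordinate_offsets(full: dict, sub: dict) -> dict:
    """Offset of each start and end coordinate between the two numbering schemes."""
    return {
        name: (full[name][0] - sub[name][0], full[name][1] - sub[name][1])
        for name in full
    }


offsets = coordinate_offsets(FIG5, FIG6)
codon_counts = {name: orf_length(*span) // 3 for name, span in FIG6.items()}

for name in FIG6:
    print(name, "offset", offsets[name], "codons", codon_counts[name])
```

Every coordinate differs by the same offset of 16983 nucleotides between the two figures, and each ORF length is a multiple of three, which is consistent with the Figure 6 sequence being a contiguous portion of the Figure 5 fragment.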

Figure 7 shows a comparison of the PKS-like gene clusters of Shewanella putrefaciens
and Vibrio marinus; Figure 7B is the Vibrio marinus operon sequence.

Figure 8 is an expanded view of the PKS-like gene cluster portion of Vibrio marinus
shown in Figure 7B showing that ORFs 6, 7 and 8 are in reading frame 2, while ORF 9 is in
reading frame 3.

Figure 9 demonstrates sequence homology of ORF 6 of Shewanella putrefaciens and
Vibrio marinus. The Shewanella ORF 6 is depicted on the vertical axis, and the Vibrio ORF 6 is 
depicted on the horizontal axis. Lines indicate regions of the proteins that have a 60% identity. 
The repeated lines in the middle correspond to the multiple ACP domains found in ORF 6. 

Figure 10 demonstrates sequence homology of ORF 7 of Shewanella putrefaciens and
Vibrio marinus. The Shewanella ORF 7 is depicted on the vertical axis, and the Vibrio ORF 7 is 
depicted on the horizontal axis. Lines indicate regions of the proteins that have a 60% identity. 

Figure 11 demonstrates sequence homology of ORF 8 of Shewanella putrefaciens and
Vibrio marinus. The Shewanella ORF 8 is depicted on the vertical axis, and the Vibrio ORF 8
is depicted on the horizontal axis. Lines indicate regions of the proteins that have a 60% 
identity. 

Figure 12 demonstrates sequence homology of ORF 9 of Shewanella putrefaciens and 
Vibrio marinus. The Shewanella ORF 9 is depicted on the vertical axis, and the Vibrio ORF 9 is 
depicted on the horizontal axis. Lines indicate regions of the proteins that have a 60% identity. 

Figure 13 is a depiction of various complementation experiments, and resulting PUFA 
production. Shown on the right is the longest PUFA made in the E. coli strain containing the
Vibrio and Shewanella genes depicted on the left. The hollow boxes indicate ORFs from
Shewanella. The solid boxes indicate ORFs from Vibrio. 

Figure 14 is a chromatogram showing fatty acid production from complementation of 
pEPAD8 from Shewanella (deletion ORF 8) with ORF 8 from Shewanella, in E. coli Fad E-. 
The chromatogram presents an EPA (20:5) peak. 

Figure 15 is a chromatogram showing fatty acid production from complementation of 
pEPAD8 from Shewanella (deletion ORF 8) with ORF 8 from Vibrio marinus, in E. coli Fad E-. 
The chromatogram presents EPA (20:5) and DHA (22:6) peaks.

Figure 16 is a table of PUFA values from the ORF 8 complementation experiment, the 
chromatogram of which is shown in Figure 15. 

Figure 17 is a plasmid map showing the elements of pCGN7770. 
Figure 18 is a plasmid map showing the elements of pCGN8535. 
Figure 19 is a plasmid map showing the elements of pCGN8537. 
Figure 20 is a plasmid map showing the elements of pCGN8525. 
Figure 21 is a comparison of the Shewanella ORFs as defined by Yazawa (1996) supra,
and those disclosed in Figure 4. When a protein starting at the leucine (TTG) codon at
nucleotides 9157-9155 and ending at the stop codon at nucleotides 8185-8183 is expressed
under control of a heterologous promoter in an E. coli strain containing the entire PKS-like
cluster except ORF 3, the recombinant cells do produce EPA. Thus, the published protein
sequence is likely to be wrong, and the coding sequence for the protein may start at the TTG
codon at nucleotides 9157-9155 or the TTG codon at nucleotides 9172-9170. This information
is critical to the expression of a functional PKS-like cluster in a heterologous system.

Figure 22 is a plasmid map showing the elements of pCGN8560.

Figure 23 is a plasmid map showing the elements of pCGN8556.

Figure 24 shows the translated DNA sequence (SEQ ID NO:14) upstream of the
published ORF 3 and the corresponding amino acids for which they code (SEQ ID NO:15). The
ATG start codon at position 9016 is the start codon for the protein described by Yazawa et al.
(1996) supra. The other arrows depict TTG or ATT codons that can also serve as start codons in
bacteria. When ORF 3 is started from the published ATG codon at 9016, the protein is not
functional in making EPA. When ORF 3 is initiated at the TTG codon at position 9157, the
protein is capable of facilitating EPA synthesis.
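
The effect of the alternative start codons on the length of ORF 3 can be worked out from the coordinates given above. The following sketch is illustrative only. Because ORF 3 lies on the strand complementary to that of Figure 4A, coordinates decrease along the reading direction, and the last coding nucleotide is taken to be 8186 (the stop codon occupies 8185-8183). The helper names reverse_complement and minus_strand_orf are introduced here for the example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the sequence of the complementary strand, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Last coding nucleotide of ORF 3 on the Figure 4A numbering;
# the stop codon itself occupies nucleotides 8185-8183.
LAST_CODING_NT = 8186

# First nucleotide of each candidate start codon named in the text.
CANDIDATE_STARTS = {
    "published ATG": 9016,
    "TTG at 9157": 9157,
    "TTG at 9172": 9172,
}


def minus_strand_orf(start: int, last_coding: int = LAST_CODING_NT) -> tuple:
    """Length in nucleotides and codons of a complementary-strand ORF."""
    length = start - last_coding + 1
    if length % 3 != 0:
        raise ValueError("start codon is not in frame with the stop codon")
    return length, length // 3


orf3 = {label: minus_strand_orf(pos) for label, pos in CANDIDATE_STARTS.items()}

for label, (nt, codons) in orf3.items():
    print(f"{label}: {nt} nt, {codons} codons")

# A TTG start codon on the complementary strand appears as CAA in Figure 4A.
print(reverse_complement("CAA"))
```

On this arithmetic all three candidate starts are in frame with the stop codon, and initiating at the TTG codon at 9157 adds 47 codons upstream of the published ATG.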

Figure 25 shows the PCR product (SEQ ID NO:16) for SS9 Photobacter using primers in
Example 1.

Figure 26 shows probe sequences (SEQ ID NOS:17-31) resulting from PCR with
primers presented in Example 1.

Figure 27 shows the nucleotide sequence of Schizochytrium EST clones: A. LIB3033-047-B5,
LIB3033-046-E6 and a bridging PCR product, which have now been assembled into a partial
cDNA sequence (ORF6 homolog); B. LIB3033-046-D2 (hglc/ORF7/ORF8/ORF9 homolog); C.
LIB81-015-D5, LIB81-042-B9 and a bridging PCR product, which have now been assembled into a
partial cDNA sequence (ORF8/ORF9 homolog).

Figure 28 shows a schematic of the similarities between Shewanella PKS sequences and
Schizochytrium sequences.

Figure 29 shows the amino acid sequences inferred from Schizochytrium EST clones: A.
ORF6 homolog; B. hglc/ORF7/ORF8/ORF9 homolog; C. ORF8/ORF9 homolog.

DESCRIPTION OF THE PREFERRED EMBODIMENTS 

In accordance with the subject invention, novel DNA sequences, DNA constructs and
methods are provided, which include some or all of the polyketide-like synthesis (PKS-like)
pathway genes from Shewanella, Vibrio, Schizochytrium or other microorganisms, for
modifying the poly-unsaturated long chain fatty acid content of host cells, particularly host plant
cells. The present invention demonstrates that EPA synthesis genes in Shewanella putrefaciens
constitute a polyketide-like synthesis pathway. Functions are ascribed to the Shewanella,
Schizochytrium and Vibrio genes and methods are provided for the production of EPA and DHA
in host cells. The method includes the step of transforming cells with an expression cassette
comprising a DNA encoding a polypeptide capable of increasing the amount of one or more
PUFA in the host cell. Desirably, integration constructs are prepared which provide for
integration of the expression cassette into the genome of a host cell. Host cells are manipulated
to express a sense or antisense DNA encoding a polypeptide(s) that has PKS-like gene activity.
By "PKS-like gene" is intended a polypeptide which is responsible for any one or more of the
functions of a PKS-like activity of interest. By "polypeptide" is meant any chain of amino
acids, regardless of length or post-translational modification, for example, glycosylation or
phosphorylation. Depending upon the nature of the host cell, the substrate(s) for the expressed
enzyme may be produced by the host cell or may be exogenously supplied. Of particular
interest is the selective control of PUFA production in plant tissues and/or plant parts such as
leaves, roots, fruits and seeds. The invention can be used to synthesize EPA, DHA, and other
related PUFAs in host cells.

There are many advantages to transgenic production of PUFAs. As an example, in 
transgenic E. coli as in Shewanella, EPA accumulates in the phospholipid fraction, specifically
in the sn-2 position. It may be possible to produce a structured lipid in a desired host cell which
differs substantially from that produced in either Shewanella or E. coli. Additionally, transgenic
production of PUFAs in particular host cells offers several advantages over purification from
natural sources such as fish or plants. In transgenic plants, by utilizing a PKS-like system, fatty
acid synthesis of PUFAs is achieved in the cytoplasm by a system which produces the PUFAs
through de novo production of the fatty acids utilizing malonyl Co-A and acetyl Co-A as
substrates. In this fashion, potential problems, such as those associated with substrate
competition and diversion of normal products of fatty acid synthesis in a host to PUFA
production, are avoided.

Production of fatty acids from recombinant plants provides the ability to alter the 
naturally occurring plant fatty acid profile by providing new synthetic pathways in the host or by 
suppressing undesired pathways, thereby increasing levels of desired PUFAs, or conjugated 
forms thereof, and decreasing levels of undesired PUFAs. Production of fatty acids in 
transgenic plants also offers the advantage that expression of PKS-like genes in particular tissues 
and/or plant parts means that greatly increased levels of desired PUFAs in those tissues and/or 
parts can be achieved, making recovery from those tissues more economical. Expression in a
plant tissue and/or plant part presents certain efficiencies, particularly where the tissue or part is 
one which is easily harvested, such as seed, leaves, fruits, flowers, roots, etc. For example, the 
desired PUFAs can be expressed in seed; methods of isolating seed oils are well established. In 
addition to providing a source for purification of desired PUFAs, seed oil components can be 
manipulated through expression of PKS-like genes, either alone or in combination with other 
genes such as elongases, to provide seed oils having a particular PUFA profile in concentrated 
form. The concentrated seed oils then can be added to animal milks and/or synthetic or
semisynthetic milks to serve as infant formulas where human nursing is impossible or undesired,
or in cases of malnourishment or disease in both adults and infants. 

Transgenic microbial production of fatty acids offers the advantages that many microbes 
are known with greatly simplified oil compositions as compared with those of higher organisms, 
making purification of desired components easier. Microbial production is not subject to 
fluctuations caused by external variables such as weather and food supply. Microbially 
produced oil is substantially free of contamination by environmental pollutants. Additionally, 
microbes can provide PUFAs in particular forms which may have specific uses. For example, 
Spirulina can provide PUFAs predominantly at the first and third positions of triglycerides; 
digestion by pancreatic lipases preferentially releases fatty acids from these positions. 
Following human or animal ingestion of triglycerides derived from Spirulina, these PUFAs are 
released by pancreatic lipases as free fatty acids and thus are directly available, for example, for 
infant brain development. Additionally, microbial oil production can be manipulated by 
controlling culture conditions, notably by providing particular substrates for microbially 
expressed enzymes, or by addition of compounds which suppress undesired biochemical 
pathways. In addition to these advantages, production of fatty acids from recombinant microbes 
provides the ability to alter the naturally occurring microbial fatty acid profile by providing new 
synthetic pathways in the host or by suppressing undesired pathways, thereby increasing levels 
of desired PUFAs, or conjugated forms thereof, and decreasing levels of undesired PUFAs. 

Production of fatty acids in animals also presents several advantages. Expression of 
desaturase genes in animals can produce greatly increased levels of desired PUFAs in animal 
tissues, making recovery from those tissues more economical. For example, where the desired 
PUFAs are expressed in the breast milk of animals, methods of isolating PUFAs from animal 
milk are well established. In addition to providing a source for purification of desired PUFAs, 
animal breast milk can be manipulated through expression of desaturase genes, either alone or in 
combination with other human genes, to provide animal milks with a PUFA composition 
substantially similar to human breast milk during the different stages of infant development. 
Humanized animal milks could serve as infant formulas where human nursing is impossible or 
undesired, or in the cases of malnourishment or disease. 

DNAs encoding desired PKS-like genes can be identified in a variety of ways. In one 
method, a source of a desired PKS-like gene, for example genomic libraries from Shewanella,
Schizochytrium or Vibrio spp., is screened with detectable enzymatically- or
chemically-synthesized probes. Sources of ORFs having PKS-like genes are those organisms which produce
a desired PUFA, including DHA-producing or EPA-producing deep sea bacteria growing
preferentially under high pressure or at relatively low temperature. Microorganisms such as
Shewanella which produce EPA or DHA also can be used as a source of PKS-like genes. The
probes can be made from DNA, RNA, or non-naturally occurring nucleotides, or mixtures 



WO 00/42195    PCT/US00/00956



thereof. Probes can be enzymatically synthesized from DNAs of known PKS-like genes for
normal or reduced-stringency hybridization methods. For discussions of nucleic acid probe
design and annealing conditions, see, for example, Sambrook et al., Molecular Cloning: A
Laboratory Manual (2nd ed.), Vols. 1-3, Cold Spring Harbor Laboratory, (1989) or Current
Protocols in Molecular Biology, F. Ausubel et al., ed., Greene Publishing and
Wiley-Interscience, New York (1987), each of which is incorporated herein by reference. Techniques
for manipulation of nucleic acids encoding PUFA enzymes such as subcloning nucleic acid
sequences encoding polypeptides into expression vectors, labelling probes, DNA hybridization,
and the like are described generally in Sambrook, supra.

Oligonucleotide probes also can be used to screen sources and can be based on sequences 
of known PKS-like genes, including sequences conserved among known PKS-like genes, or on 
peptide sequences obtained from a desired purified protein. Oligonucleotide probes based on 
amino acid sequences can be degenerate to encompass the degeneracy of the genetic code, or can 
be biased in favor of the preferred codons of the source organism. Alternatively, a desired 
protein can be entirely sequenced and total synthesis of a DNA encoding that polypeptide 
performed. 
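
The degeneracy that an oligonucleotide probe must cover can be estimated from the number of codons for each amino acid. The sketch below is a minimal illustration, not a probe design tool: it assumes the standard genetic code, and the function names and the example peptide are hypothetical. It multiplies the codon counts across a peptide and finds the least degenerate window of a given width.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code (one-letter codes).
CODON_COUNT = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4, "H": 2, "I": 3,
    "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6, "T": 4, "W": 1, "Y": 2, "V": 4,
}


def degeneracy(peptide: str) -> int:
    """Number of distinct DNA sequences that could encode the peptide."""
    return prod(CODON_COUNT[aa] for aa in peptide.upper())


def least_degenerate_window(protein: str, width: int) -> tuple:
    """Return (start index, peptide, degeneracy) of the least degenerate window.

    Ties are resolved in favor of the window closest to the N-terminus.
    """
    protein = protein.upper()
    if width < 1 or width > len(protein):
        raise ValueError("window width must fit within the protein")
    score, start = min(
        (degeneracy(protein[i:i + width]), i)
        for i in range(len(protein) - width + 1)
    )
    return start, protein[start:start + width], score


# Hypothetical peptide used only to exercise the functions.
example = "LSRMWKDEL"
print(least_degenerate_window(example, 5))
```

Windows rich in methionine and tryptophan, which have a single codon each, give the smallest probe pools; biasing toward the preferred codons of the source organism reduces the pool further.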

Once the desired DNA has been isolated, it can be sequenced by known methods. It is 
recognized in the art that such methods are subject to errors, such that multiple sequencing of the 
same region is routine and is still expected to lead to measurable rates of mistakes in the 
resulting deduced sequence, particularly in regions having repeated domains, extensive
secondary structure, or unusual base compositions, such as regions with high GC base content. 
When discrepancies arise, resequencing can be done and can employ special methods. Special 
methods can include altering sequencing conditions by using: different temperatures; different 
enzymes; proteins which alter the ability of oligonucleotides to form higher order structures; 
altered nucleotides such as ITP or methylated dGTP; different gel compositions, for example 
adding formamide; different primers or primers located at different distances from the problem 
region; or different templates such as single stranded DNAs. Sequencing of mRNA can also be 
employed. 

For the most part, some or all of the coding sequences for the polypeptides having
PKS-like gene activity are from a natural source. In some situations, however, it is desirable to
modify all or a portion of the codons, for example, to enhance expression, by employing host
preferred codons. Host preferred codons can be determined from the codons of highest
frequency in the proteins expressed in the largest amount in a particular host species of interest.
Thus, the coding sequence for a polypeptide having PKS-like gene activity can be synthesized
in whole or in part. All or portions of the DNA also can be synthesized to remove any
destabilizing sequences or regions of secondary structure which would be present in the
transcribed mRNA. All or portions of the DNA also can be synthesized to alter the base
composition to one more preferable to the desired host cell. Methods for synthesizing sequences
and bringing sequences together are well established in the literature. In vitro mutagenesis and
selection, site-directed mutagenesis, or other means can be employed to obtain mutations of
naturally occurring PKS-like genes to produce a polypeptide having PKS-like gene activity in
vivo with more desirable physical and kinetic parameters for function in the host cell, such as a
longer half-life or a higher rate of production of a desired polyunsaturated fatty acid.
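
The determination of host preferred codons described above can be sketched as a simple frequency count. The Python below is a minimal illustration under stated assumptions: the standard genetic code is used, the reference sequences stand in for coding sequences of highly expressed host proteins and are invented for the example, and the function names are hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def split_codons(cds: str) -> list:
    """Split a coding sequence into complete codons, ignoring any trailing bases."""
    cds = cds.upper()
    return [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]


def preferred_codons(reference_cds: list) -> dict:
    """Most frequent codon for each amino acid across the reference sequences."""
    usage = Counter(codon for cds in reference_cds for codon in split_codons(cds))
    best = {}
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue
        if aa not in best or usage[codon] > usage[best[aa]]:
            best[aa] = codon
    return best


def translate(cds: str) -> str:
    return "".join(CODON_TABLE[codon] for codon in split_codons(cds))


def recode(cds: str, preferred: dict) -> str:
    """Replace each sense codon with the host preferred synonym; keep stop codons."""
    return "".join(
        codon if CODON_TABLE[codon] == "*" else preferred[CODON_TABLE[codon]]
        for codon in split_codons(cds)
    )


# Invented reference sequences standing in for highly expressed host genes.
reference = ["ATGCTGCTGAAAGGC", "ATGCTGAAAGGCGGC"]
preferred = preferred_codons(reference)
recoded = recode("ATGTTAAAGGGTTAA", preferred)
print(recoded, translate(recoded))
```

The recoded sequence encodes the same polypeptide as the input. A practical design would also weigh the destabilizing sequences, secondary structure and base composition mentioned in the text, which this sketch does not address.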

Of particular interest are the Shewanella putrefaciens ORFs and the corresponding ORFs 
of Vibrio marinus and Schizochytrium. The Shewanella putrefaciens PKS-like genes can be 
expressed in transgenic plants to effect biosynthesis of EPA. Other DNAs which are 
substantially identical in sequence to the Shewanella putrefaciens PKS-like genes, or which 
encode polypeptides which are substantially similar to PKS-like genes of Shewanella 
putrefaciens can be used, such as those identified from Vibrio marinus or Schizochytrium. By 
"substantially identical in sequence" is intended an amino acid sequence or nucleic acid sequence 
exhibiting, in order of increasing preference, at least 60%, 80%, 90% or 95% homology to the 
DNA sequence of the Shewanella putrefaciens PKS-like genes or nucleic acid sequences 
encoding the amino acid sequences for such genes. For polypeptides, the length of comparison 
sequences generally is at least 16 amino acids, preferably at least 20 amino acids, and most 
preferably 35 amino acids. For nucleic acids, the length of comparison sequences generally is at 
least 50 nucleotides, preferably at least 60 nucleotides, more preferably at least 75 
nucleotides, and most preferably 110 nucleotides. 
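
The thresholds above can be illustrated with a short Python sketch. It is a simplification under stated assumptions: it takes two sequences that have already been aligned (gaps written as "-") and uses plain percent identity as a stand-in for the homology scores computed by sequence analysis packages. The function names and the toy peptides are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Return (percent identity, number of compared positions).

    Both inputs must be pre-aligned strings of equal length; columns
    containing a gap ('-') in either sequence are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        return 0.0, 0
    return 100.0 * matches / compared, compared


def substantially_identical(aligned_a, aligned_b,
                            min_identity=60.0, min_length=16):
    """Apply the least stringent polypeptide thresholds from the text:
    at least 60% over a comparison length of at least 16 amino acids."""
    identity, compared = percent_identity(aligned_a, aligned_b)
    return compared >= min_length and identity >= min_identity


# Hypothetical 18-residue peptides differing at one position.
a = "MKTLLVAGGSGFIGSHLV"
b = "MKTLLVAGGSGFIGAHLV"
identity, compared = percent_identity(a, b)
print(round(identity, 1), compared, substantially_identical(a, b))
```

For nucleic acids the same check would be run with min_length set to 50 or more, as stated above.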

Homology typically is measured using sequence analysis software, for example, the 
Sequence Analysis software package of the Genetics Computer Group, University of Wisconsin 
Biotechnology Center, 1710 University Avenue, Madison, Wisconsin 53705; MEGAlign 
(DNAStar, Inc., 1228 S. Park St., Madison, Wisconsin 53715); MacVector (Oxford 
Molecular Group, 2105 S. Bascom Avenue, Suite 200, Campbell, California 95008); BLAST 
(National Center for Biotechnology Information (NCBI), www.ncbi.nlm.gov); and FASTA (Pearson 
and Lipman, Science (1985) 227:1435-1446). Such software matches similar sequences by 
assigning degrees of homology to various substitutions, deletions, and other modifications. 
Conservative substitutions typically include substitutions within the following groups: glycine 
and alanine; valine, isoleucine and leucine; aspartic acid, glutamic acid, asparagine, and 
glutamine; serine and threonine; lysine and arginine; and phenylalanine and tyrosine. 
Substitutions may also be made on the basis of conserved hydrophobicity or hydrophilicity 
(Kyte and Doolittle, J. Mol. Biol. (1982) 157:105-132), or on the basis of the ability to assume 
similar polypeptide secondary structure (Chou and Fasman, Adv. Enzymol. (1978) 47:45-148). 
A related protein to the probing sequence is identified when p > 0.01, preferably p > 10^(…) 
or 10^(…). 

Encompassed by the present invention are related PKS-like genes from the same or other 
organisms. Such related PKS-like genes include variants of the disclosed PKS-like ORFs that 
occur naturally within the same or different species of Shewanella, as well as homologues of the 
disclosed PKS-like genes from other species and evolutionary related proteins having 
analogous function and activity. Also included are PKS-like genes which, although not 
substantially identical to the Shewanella putrefaciens PKS-like genes, operate in a similar 
fashion to produce PUFAs as part of a PKS-like system. Related PKS-like genes can be 
identified by their ability to function substantially the same as the disclosed PKS-like genes; that 
is, they can be substituted for corresponding ORFs of Shewanella, Schizochytrium or Vibrio and 
still effectively produce EPA or DHA. Related PKS-like genes also can be identified by 
screening sequence databases for sequences homologous to the disclosed PKS-like genes, by 
hybridization of a probe based on the disclosed PKS-like genes to a library constructed from the 
source organism, or by RT-PCR using mRNA from the source organism and primers based on 
the disclosed PKS-like gene. Thus, the phrase "PKS-like genes" refers not only to the nucleotide 
sequences disclosed herein, but also to other nucleic acids that are allelic or species variants of 
these nucleotide sequences. It is also understood that these terms include nonnatural mutations 
introduced by deliberate mutation using recombinant technology such as single site mutation or 
by excising short sections of DNA open reading frames coding for PUFA enzymes or by 
substituting new codons or adding new codons. Such minor alterations substantially maintain 
the immunoidentity of the original expression product and/or its biological activity. The 
biological properties of the altered PUFA enzymes can be determined by expressing the 
enzymes in an appropriate cell line and by determining the ability of the enzymes to synthesize 
PUFAs. Particular enzyme modifications considered minor would include substitution of amino 
acids of similar chemical properties, e.g., glutamic acid for aspartic acid or glutamine for 
asparagine. 
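
The notion of a minor, conservative substitution can be sketched in Python using the substitution groups listed earlier in this section (glycine and alanine; valine, isoleucine and leucine; and so on). This is an illustrative helper only: the function name is hypothetical, and residues are given in one-letter code.

```python
# Conservative substitution groups from the text, in one-letter code:
# G/A; V/I/L; D/E/N/Q; S/T; K/R; F/Y.
CONSERVATIVE_GROUPS = [
    set("GA"),
    set("VIL"),
    set("DENQ"),
    set("ST"),
    set("KR"),
    set("FY"),
]


def is_conservative(original, replacement):
    """Return True if replacing `original` with `replacement` stays
    within one of the conservative groups (identical residues count
    as trivially conservative)."""
    original, replacement = original.upper(), replacement.upper()
    if original == replacement:
        return True
    return any(original in group and replacement in group
               for group in CONSERVATIVE_GROUPS)


print(is_conservative("D", "E"))  # glutamic acid for aspartic acid
print(is_conservative("N", "Q"))  # glutamine for asparagine
print(is_conservative("K", "E"))  # lysine to glutamic acid
```

A grouping of this kind does not capture the hydrophobicity-based or secondary-structure-based criteria also mentioned above; those would need a separate scoring scheme.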

When utilizing a PUFA PKS-like system from another organism, the regions of a PKS- 
like gene polypeptide important for PKS-like gene activity can be determined through routine 
mutagenesis, expression of the resulting mutant polypeptides and determination of their 
activities. The coding region for the mutants can include deletions, insertions and point 
mutations, or combinations thereof. A typical functional analysis begins with deletion 
mutagenesis to determine the N- and C-terminal limits of the protein necessary for function, and 
then internal deletions, insertions or point mutants are made in the open reading frame to further 
determine regions necessary for function. Other techniques such as cassette mutagenesis or total 
synthesis also can be used. Deletion mutagenesis is accomplished, for example, by using 
exonucleases to sequentially remove the 5' or 3' coding regions. Kits are available for such 
techniques. After deletion, the coding region is completed by ligating oligonucleotides 
containing start or stop codons to the deleted coding region after 5' or 3' deletion, respectively. 
Alternatively, oligonucleotides encoding start or stop codons are inserted into the coding region 
by a variety of methods including site-directed mutagenesis, mutagenic PCR or by ligation onto 
DNA digested at existing restriction sites. Internal deletions can similarly be made through a 
variety of methods including the use of existing restriction sites in the DNA, or by use of 
mutagenic primers via site-directed mutagenesis or mutagenic PCR. Insertions are made through 
methods such as linker-scanning mutagenesis, site-directed mutagenesis or mutagenic PCR. 
Point mutations are made through techniques such as site-directed mutagenesis or mutagenic 
PCR. 

Chemical mutagenesis also can be used for identifying regions of a PKS-like gene 
polypeptide important for activity. A mutated construct is expressed, and the ability of the 
resulting altered protein to function as a PKS-like gene is assayed. Such structure-function 
analysis can determine which regions may be deleted, which regions tolerate insertions, and 
which point mutations allow the mutant protein to function in substantially the same way as the 
native PKS-like gene. All such mutant proteins and nucleotide sequences encoding them are 
within the scope of the present invention. 

EPA is produced in Shewanella as the product of a 
PKS-like system, such that the EPA genes encode components of this system. In Vibrio, DHA is 
produced by a similar system. The enzymes which synthesize these fatty acids are encoded by a 
cluster of genes which are distinct from the fatty acid synthesis genes encoding the enzymes 
involved in synthesis of the C16 and C18 fatty acids typically found in bacteria and in plants. 
As the Shewanella EPA genes represent a PKS-like gene cluster, EPA production is, at least to 
some extent, independent of the typical bacterial type II FAS system. Thus, production of EPA 
in the cytoplasm of plant cells can be achieved by expression of the PKS-like pathway genes in 
plant cells under the control of appropriate plant regulatory signals. 

EPA production in E. coli transformed with the Shewanella EPA genes proceeds during 
anaerobic growth, indicating that O2-dependent desaturase reactions are not involved. Analysis 
of the proteins encoded by the ORFs essential for EPA production reveals the presence of 
domain structures characteristic of PKS-like systems. Fig. 2A shows a summary of the domains, 
motifs, and also key homologies detected by "BLAST" data bank searches. Because EPA is 
different from many of the other substances produced by PKS-like pathways, i.e., it contains 5 
cis double bonds, spaced at 3 carbon intervals along the molecule, a PKS-like system for 
synthesis of EPA is not expected. 
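
The regular spacing of the double bonds can be made concrete with a small Python sketch. It assumes, as is standard for an omega-3 fatty acid such as EPA, that the first double bond lies 3 carbons from the methyl end; the function name and default arguments are illustrative.

```python
def delta_positions(chain_length, n_double_bonds, omega_start=3, spacing=3):
    """Return double-bond positions counted from the carboxyl carbon
    (delta nomenclature) for bonds spaced regularly from the methyl end."""
    # Positions counted from the methyl (omega) end: 3, 6, 9, ...
    omega = [omega_start + spacing * i for i in range(n_double_bonds)]
    # Convert to delta numbering: delta = chain length - omega position.
    return sorted(chain_length - w for w in omega)


print(delta_positions(20, 5))  # EPA, 20 carbons, 5 double bonds
print(delta_positions(22, 6))  # DHA, 22 carbons, 6 double bonds
```

For EPA this gives double bonds at carbons 5, 8, 11, 14 and 17, and for DHA at carbons 4, 7, 10, 13, 16 and 19.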

Further, BLAST searches using the domains present in the Shewanella EPA ORFs reveal 
that several are related to proteins encoded by a PKS-like gene cluster found in Anabaena. The 
structure of that region of the Anabaena chromosome is shown in Fig. 2F. The Anabaena PKS- 
like genes have been linked to the synthesis of a long-chain (C26), hydroxy-fatty acid found in a 
glycolipid layer of heterocysts. The EPA protein domains with homology to the Anabaena 
proteins are indicated in Fig. 2F. 

ORF 6 of Shewanella contains a KAS domain which includes an active site motif 
(DXAC*), SEQ ID NO:32, as well as a "GFGG", SEQ ID NO:33, motif which is present at the 
end of many Type II KAS proteins (see Fig. 2A). Extended motifs are present but not shown 
here. Next is a malonyl-CoA:ACP acyl transferase (AT) domain. Sequences near the active site 
motif (GHS*XG), SEQ ID NO:34, suggest it transfers malonate rather than methylmalonate, i.e., 
it resembles the acetate-like ATs. Following a linker region, there is a cluster of 6 repeating 
domains, each ~100 amino acids in length, which are homologous to PKS-like ACP sequences. 
Each contains a pantetheine binding site motif (LGXDS*(L/I)), SEQ ID NOS:35 and 36. The 
presence of 6 such ACP domains has not been observed previously in fatty acid synthases (FAS) 
or PKS-like systems. Near the end of the protein is a region which shows homology to β-keto- 
ACP reductases (KR). It contains a pyridine nucleotide binding site motif "GXGXX(G/A/P)", 
SEQ ID NOS:37, 38 and 39. 
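
As a rough illustration of how the short motifs named above could be located in a protein sequence, the following Python sketch translates each motif into a regular expression, with X matching any residue. The dictionary, the function name and the toy sequence are hypothetical; the sequence is not taken from ORF 6, and real domain assignment relies on extended motifs and database searches rather than on these short patterns alone.

```python
import re

# Motifs from the text written as regular expressions; "." stands for X.
MOTIFS = {
    "DXAC": "D.AC",               # KAS active site
    "GFGG": "GFGG",               # end of many Type II KAS proteins
    "GHSXG": "GHS.G",             # AT active site
    "LGXDS(L/I)": "LG.DS[LI]",    # ACP pantetheine binding site
    "GXGXX(G/A/P)": "G.G..[GAP]"  # KR pyridine nucleotide binding site
}


def find_motifs(protein):
    """Return 1-based start positions of each motif in `protein`.

    A lookahead is used so that overlapping matches are all reported.
    """
    protein = protein.upper()
    hits = {}
    for name, pattern in MOTIFS.items():
        hits[name] = [m.start() + 1
                      for m in re.finditer(f"(?=({pattern}))", protein)]
    return hits


# Hypothetical toy sequence constructed to contain one copy of each motif.
toy = "MADTACKKGHSLGAALGIDSLEEGFGGTTGAGSLGV"
for motif, positions in find_motifs(toy).items():
    print(motif, positions)
```

Short patterns like these match by chance in long proteins, so hits would normally be treated as candidates to be confirmed against the surrounding sequence.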

The Shewanella ORF 8 begins with a KAS domain, including active site and ending 
motifs (Fig. 2C). The best match in the data banks is with the Anabaena HglD. There is also a 
domain which has sequence homology to the N-terminal one half of the Anabaena HglC. This 
region also shows weak homology to KAS proteins although it lacks the active site and ending 
motifs. It has the characteristics of the so-called chain length factors (CLF) of Type II PKS-like 
systems. ORF 8 appears to direct the production of EPA versus DHA by the PKS-like system. 
ORF 8 also has two domains with homology to β-hydroxyacyl-ACP dehydrases (DH). The best 
match for both domains is with E. coli FabA, a bi-functional enzyme which carries out both the 
dehydrase reaction and an isomerization (trans to cis) of the resulting double bond. The first 
DH domain contains both the active site histidine (H) and an adjacent cysteine (C) implicated in 
FabA catalysis. The second DH domain has the active site H but lacks the adjacent C (Fig. 2C). 
BLAST searches with the second DH domain also show matches to FabZ, a second E. coli DH, 
which does not possess isomerase activity. 

The N-terminal half of ORF 7 (Fig. 2B) has no significant matches in the data banks. 
The best match of the C-terminal half is with a C-terminal portion of the Anabaena HglC. This 
domain contains an acyl-transferase (AT) motif (GXSXG), SEQ ID NO:40. Comparison of the 
extended active site sequences, based on the crystal structure of the E. coli malonyl-CoA:ACP 
AT, reveals that ORF 7 lacks two residues essential for exclusion of water from the active site 
(E. coli nomenclature; Q11 and R117). These data suggest that ORF 7 may function as a 
thioesterase. 

ORF 9 (Fig. 2D) is homologous to an ORF of unknown function in the Anabaena Hgl 
cluster. It also exhibits a very weak homology to NIFA, a regulatory protein in nitrogen fixing 
bacteria. A regulatory role for the ORF 9 protein has not been excluded. ORF 3 (Fig. 2E) is 
homologous to the Anabaena HetI as well as EntD from E. coli and Sfp of Bacillus. Recently, a 
new enzyme family of phosphopantetheinyl transferases has been identified that includes HetI, 
EntD and Sfp (Lambalot RH, et al. (1996) A new enzyme superfamily - the phosphopantetheinyl 
transferases. Chemistry & Biology, Vol 3, #11, 923-936). The data of Fig. 3 demonstrate that 
the presence of ORF 3 is required for addition of β-alanine (i.e. pantetheine) to the ORF 6 
protein. Thus, ORF 3 encodes the phosphopantetheinyl transferase specific for the ORF 6 ACP 
domains. (See, Haydock SF et al. (1995) Divergent sequence motifs correlated with the 
substrate specificity of (methyl)malonyl-CoA:acyl carrier protein transacylase domains in 
modular polyketide synthases, FEBS Lett., 374, 246-248). Malonate is the source of the carbons 
utilized in the extension reactions of EPA synthesis. Additionally, malonyl-CoA rather than 
malonyl-ACP is the AT substrate, i.e., the AT region of ORF 6 uses malonyl-CoA. 

Once the DNA sequences encoding the PKS-like genes of an organism responsible for 
PUFA production have been obtained, they are placed in a vector capable of replication in a host 
cell, or propagated in vitro by means of techniques such as PCR or long PCR. Replicating 
vectors can include plasmids, phage, viruses, cosmids and the like. Desirable vectors include 
those useful for mutagenesis of the gene of interest or for expression of the gene of interest in 
host cells. A PUFA synthesis enzyme or a homologous protein can be expressed in a variety of 
recombinantly engineered cells. Numerous expression systems are available for expression of 
DNA encoding a PUFA enzyme. The expression of natural or synthetic nucleic acids encoding 
PUFA enzyme is typically achieved by operably linking the DNA to a promoter (which is either 
constitutive or inducible) within an expression vector. By expression vector is meant a DNA 
molecule, linear or circular, that comprises a segment encoding a PUFA enzyme, operably 
linked to additional segments that provide for its transcription. Such additional segments 
include promoter and terminator sequences. An expression vector also may include one or more 
origins of replication, one or more selectable markers, an enhancer, a polyadenylation signal, etc. 
Expression vectors generally are derived from plasmid or viral DNA, and can contain elements 
of both. The term "operably linked" indicates that the segments are arranged so that they 
function in concert for their intended purposes, for example, transcription initiates in the 
promoter and proceeds through the coding segment to the terminator. See Sambrook et al, 
supra. 

The technique of long PCR has made in vitro propagation of large constructs possible, so 
that modifications to the gene of interest, such as mutagenesis or addition of expression signals, 
and propagation of the resulting constructs can occur entirely in vitro without the use of a 
replicating vector or a host cell. In vitro expression can be accomplished, for example, by 
placing the coding region for the desaturase polypeptide in an expression vector designed for in 
vitro use and adding rabbit reticulocyte lysate and cofactors; labeled amino acids can be 
incorporated if desired. Such in vitro expression vectors may provide some or all of the 
expression signals necessary in the system used. These methods are well known in the art and 
the components of the system are commercially available. The reaction mixture can then be 
assayed directly for PKS-like enzymes, for example by determining their activity, or the 
synthesized enzyme can be purified and then assayed. 

Expression in a host cell can be accomplished in a transient or stable fashion. Transient 
expression can occur from introduced constructs which contain expression signals functional in 
the host cell, but which constructs do not replicate and rarely integrate in the host cell, or where 
the host cell is not proliferating. Transient expression also can be accomplished by inducing the 
activity of a regulatable promoter operably linked to the gene of interest, although such inducible 
systems frequently exhibit a low basal level of expression. Stable expression can be achieved by 
introduction of a nucleic acid construct that can integrate into the host genome or that 
autonomously replicates in the host cell. Stable expression of the gene of interest can be 
selected for through the use of a selectable marker located on or transfected with the expression 
construct, followed by selection for cells expressing the marker. When stable expression results 
from integration, integration of constructs can occur randomly within the host genome or can be 
targeted through the use of constructs containing regions of homology with the host genome 
sufficient to target recombination with the host locus. Where constructs are targeted to an 
endogenous locus, all or some of the transcriptional and translational regulatory regions can be 
provided by the endogenous locus. To achieve expression in a host cell, the transformed DNA is 
operably associated with transcriptional and translational initiation and termination regulatory 
regions that are functional in the host cell. 

Transcriptional and translational initiation and termination regions are derived from a 
variety of nonexclusive sources, including the DNA to be expressed, genes known or suspected 
to be capable of expression in the desired system, expression vectors, and chemical synthesis. The 
termination region can be derived from the 3' region of the gene from which the initiation region 
was obtained or from a different gene. A large number of termination regions are known and 
have been found to be satisfactory in a variety of hosts from the same and different genera and 
species. The termination region usually is selected more as a matter of convenience than 
because of any particular property. When expressing more than one PKS-like ORF in the same 
cell, appropriate regulatory regions and expression methods should be used. Introduced genes 
can be propagated in the host cell through use of replicating vectors or by integration into the 
host genome. Where two or more genes are expressed from separate replicating vectors, it is 
desirable that each vector has a different means of replication. Each introduced construct, 
whether integrated or not, should have a different means of selection and should lack homology 
to the other constructs to maintain stable expression and prevent reassortment of elements 
among constructs. Judicious choices of regulatory regions, selection means and method of 
propagation of the introduced construct can be experimentally determined so that all introduced 
genes are expressed at the necessary levels to provide for synthesis of the desired products. 

A variety of prokaryotic expression systems can be used to express PUFA enzyme. 
Expression vectors can be constructed which contain a promoter to direct transcription, a 
ribosome binding site, and a transcriptional terminator. Examples of regulatory regions suitable 
for this purpose in E. coli are the promoter and operator region of the E. coli tryptophan 
biosynthetic pathway as described by Yanofsky (1984) J. Bacteriol., 158:1018-1024 and the 
leftward promoter of phage lambda (Pλ) as described by Herskowitz and Hagen, (1980) Ann. 
Rev. Genet., 14:399-445. The inclusion of selection markers in DNA vectors transformed in 
E. coli is also useful. Examples of such markers include genes specifying resistance to 
ampicillin, tetracycline, or chloramphenicol. Vectors used for expressing foreign genes in 
bacterial hosts generally will contain a selectable marker, such as a gene for antibiotic resistance, 
and a promoter which functions in the host cell. Plasmids useful for transforming bacteria 
include pBR322 (Bolivar, et al., (1977) Gene 2:95-113), the pUC plasmids (Messing, (1983) 
Meth. Enzymol. 101:20-77, Vieira and Messing, (1982) Gene 19:259-268), pCQV2 (Queen, 
ibid), and derivatives thereof. Plasmids may contain both viral and bacterial elements. 
Methods for the recovery of the proteins in biologically active form are discussed in U.S. Patent 
Nos. 4,966,963 and 4,999,422, which are incorporated herein by reference. See Sambrook, et al. 
for a description of other prokaryotic expression systems. 

For expression in eukaryotes, host cells for use in practicing the present invention 
include mammalian, avian, plant, insect, and fungal cells. As an example, for plants, the choice 
of a promoter will depend in part upon whether constitutive or inducible expression is desired 
and whether it is desirable to produce the PUFAs at a particular stage of plant development 
and/or in a particular tissue. Considerations for choosing a specific tissue and/or developmental 
stage for expression of the ORFs may depend on competing substrates or the ability of the host 
cell to tolerate expression of a particular PUFA. Expression can be targeted to a particular 
location within a host plant such as seed, leaves, fruits, flowers, and roots, by using specific 
regulatory sequences, such as those described in USPN 5,463,174, USPN 4,943,674, USPN 
5,106,739, USPN 5,175,095, USPN 5,420,034, USPN 5,188,958, and USPN 5,589,379. Where 
the host cell is a yeast, transcription and translational regions functional in yeast cells are 
provided, particularly from the host species. The transcriptional initiation regulatory regions can 
be obtained, for example from genes in the glycolytic pathway, such as alcohol dehydrogenase, 
glyceraldehyde-3-phosphate dehydrogenase (GPD), phosphoglucoisomerase, phosphoglycerate 
kinase, etc. or regulatable genes such as acid phosphatase, lactase, metallothionein, 
glucoamylase, etc. Any one of a number of regulatory sequences can be used in a particular 
situation, depending upon whether constitutive or induced transcription is desired, the particular 
efficiency of the promoter in conjunction with the open-reading frame of interest, the ability to 
join a strong promoter with a control region from a different promoter which allows for 
inducible transcription, ease of construction, and the like. Of particular interest are promoters 
which are activated in the presence of galactose. Galactose-inducible promoters (GAL1, GAL7, 
and GAL10) have been extensively utilized for high level and regulated expression of protein in 
yeast (Lue et al. (1987) Mol. Cell. Biol. 7:3446; Johnston, (1987) Microbiol. Rev. 51:458). 
Transcription from the GAL promoters is activated by the GAL4 protein, which binds to the 
promoter region and activates transcription when galactose is present. In the absence of 
galactose, the antagonist GAL80 binds to GAL4 and prevents GAL4 from activating 
transcription. Addition of galactose prevents GAL80 from inhibiting activation by GAL4. 
Preferably, the termination region is derived from a yeast gene, particularly Saccharomyces, 
Schizosaccharomyces, Candida or Kluyveromyces. The 3' regions of two mammalian genes, γ 
interferon and α2 interferon, are also known to function in yeast. 
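
The GAL4/GAL80 regulation described above can be summarized as a small truth-table model. The Python sketch below is a deliberately simplified illustration, not a quantitative model of yeast gene regulation; the function and parameter names are hypothetical.

```python
def gal_promoter_active(galactose_present, gal4_present=True,
                        gal80_present=True):
    """Boolean model of a GAL promoter, following the description above.

    GAL4 is required for activation. GAL80 blocks GAL4 unless
    galactose is present to relieve the inhibition.
    """
    if not gal4_present:
        return False
    gal80_inhibits = gal80_present and not galactose_present
    return not gal80_inhibits


for galactose in (False, True):
    print("galactose:", galactose,
          "-> transcription:", gal_promoter_active(galactose))
```

Within this simplified model, a host lacking GAL80 would show activation even without galactose, while a host lacking GAL4 would show none under either condition.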

Nucleotide sequences surrounding the translational initiation codon ATG have been 
found to affect expression in yeast cells. If the desired polypeptide is poorly expressed in yeast, 
the nucleotide sequences of exogenous genes can be modified to include an efficient yeast 
translation initiation sequence to obtain optimal gene expression. For expression in 
Saccharomyces, this can be done by site-directed mutagenesis of an inefficiently expressed gene 
by fusing it in-frame to an endogenous Saccharomyces gene, preferably a highly expressed gene, 
such as the lactase gene. 

An alternative to expressing the PKS-like genes in the plant cell cytoplasm is to target 
the enzymes to the chloroplast. One method to target proteins to the chloroplast entails use of 
leader peptides attached to the N-termini of the proteins. Commonly used leader peptides are 
derived from the small subunit of plant ribulose bisphosphate carboxylase. Leader sequences 
from other chloroplast proteins may also be used. Another method for targeting proteins to the 
chloroplast is to transform the chloroplast genome. Stable transformation of chloroplasts of 
Chlamydomonas reinhardtii (a green alga) using bombardment of recipient cells with high- 
velocity tungsten microprojectiles coated with foreign DNA has been described. See, for 
example, Blowers et al. Plant Cell (1989) 1:123-132 and Debuchy et al. EMBO J. (1989) 8:2803- 
2809. The transformation technique, using tungsten microprojectiles, is described by Klein et 
al. Nature (London) (1987) 327:70-73. The most common method of transforming chloroplasts 
involves using biolistic techniques, but other techniques developed for the purpose may also be 
used. Methods for targeting foreign gene products into chloroplasts (Schreier et al. EMBO J. 
(1985) 4:25-32) or mitochondria (Boutry et al., supra) have been described. See also Tomai et al. 
Gen. Biol. Chem. (1988) 263:15104-15109 and US Patent No. 4,940,835 for the use of transit 
peptides for translocating nuclear gene products into the chloroplast. Methods for directing the 
transport of proteins to the chloroplast are reviewed in Knauf TIBTECH (1987) 5:40-47. 

For producing PUFAs in avian species and cells, gene transfer can be performed by 
introducing a nucleic acid sequence encoding a PUFA enzyme into the cells following 
procedures known in the art. If a transgenic animal is desired, pluripotent stem cells of embryos 
can be provided with a vector carrying a PUFA enzyme encoding transgene and developed into 
an adult animal (USPN 5,162,215; Ono et al. (1996) Comparative Biochemistry and Physiology A 
113(3):287-292; WO 9612793; WO 9606160). In most cases, the transgene is modified to 
express high levels of the PKS-like enzymes in order to increase production of PUFAs. The 
transgenes can be modified, for example, by providing transcriptional and/or translational 
regulatory regions that function in avian cells, such as promoters which direct expression in 
particular tissues and egg parts such as yolk. The gene regulatory regions can be obtained from 
a variety of sources, including chicken anemia or avian leukosis viruses or avian genes such as a 
chicken ovalbumin gene. 

Production of PUFAs in insect cells can be conducted using baculovirus expression 
vectors harboring PKS-like transgenes. Baculovirus expression vectors are available from 
several commercial sources such as Clontech. Methods for producing hybrid and transgenic 
strains of algae, such as marine algae, which contain and express a desaturase transgene also are 
provided. For example, transgenic marine algae can be prepared as described in USPN 
5,426,040. As with the other expression systems described above, the timing, extent of 
expression and activity of the desaturase transgene can be regulated by fitting the polypeptide 
coding sequence with the appropriate transcriptional and translational regulatory regions 
selected for a particular use. Of particular interest are promoter regions which can be induced 
under preselected growth conditions. For example, introduction of temperature sensitive and/or 
metabolite responsive mutations into the desaturase transgene coding sequences, its regulatory 
regions, and/or the genome of cells into which the transgene is introduced can be used for this 
purpose. 

The transformed host cell is grown under appropriate conditions adapted for a desired 
end result. For host cells grown in culture, the conditions are typically optimized to produce the 
greatest or most economical yield of PUFAs, which relates to the selected desaturase activity. 
Media conditions which may be optimized include: carbon source, nitrogen source, addition of 
substrate, final concentration of added substrate, form of substrate added, aerobic or anaerobic 
growth, growth temperature, inducing agent, induction temperature, growth phase at induction, 
growth phase at harvest, pH, density, and maintenance of selection. Microorganisms such as 
yeast, for example, are preferably grown using selected media of interest, which include yeast 
peptone broth (YPD) and minimal media (contains amino acids, yeast nitrogen base, and 
ammonium sulfate, and lacks a component for selection, for example uracil). Desirably, 
substrates to be added are first dissolved in ethanol. Where necessary, expression of the 
polypeptide of interest may be induced, for example by including or adding galactose to induce 
expression from a GAL promoter. 

When increased expression of the PKS-like gene polypeptide in a host cell which 
expresses PUFA from a PKS-like system is desired, several methods can be employed. 
Additional genes encoding the PKS-like gene polypeptide can be introduced into the host 
organism. Expression from the native PKS-like gene locus also can be increased through 
homologous recombination, for example by inserting a stronger promoter into the host genome 
to cause increased expression, by removing destabilizing sequences from either the mRNA or 
the encoded protein by deleting that information from the host genome, or by adding stabilizing 
sequences to the mRNA (see USPN 4,910,141 and USPN 5,500,365). Thus, the subject host 
will have at least one copy of the expression construct and may have two or more, 
depending upon whether the gene is integrated into the genome, amplified, or is present on an 
extrachromosomal element having multiple copy numbers. Where the subject host is a yeast, 
four principal types of yeast plasmid vectors can be used: Yeast Integrating plasmids (YIps), 
Yeast Replicating plasmids (YRps), Yeast Centromere plasmids (YCps), and Yeast Episomal 
plasmids (YEps). YIps lack a yeast replication origin and must be propagated as integrated 
elements in the yeast genome. YRps have a chromosomally derived autonomously replicating 
sequence and are propagated as medium copy number (20 to 40), autonomously replicating, 
unstably segregating plasmids. YCps have both a replication origin and a centromere sequence 
and propagate as low copy number (10-20), autonomously replicating, stably segregating 
plasmids. YEps have an origin of replication from the yeast 2μm plasmid and are propagated as 
high copy number, autonomously replicating, irregularly segregating plasmids. The presence of 
the plasmids in yeast can be ensured by maintaining selection for a marker on the plasmid. Of 
particular interest are the yeast vectors pYES2 (a YEp plasmid available from Invitrogen, which 
confers uracil prototrophy and carries a GAL1 galactose-inducible promoter for expression), and 
pYX424 (a YEp plasmid having a constitutive TPI promoter and conferring leucine 
prototrophy; Alber and Kawasaki (1982) J. Mol. & Appl. Genetics 1:419). 

The choice of a host cell is influenced in part by the desired PUFA profile of the 
transgenic cell, and the native profile of the host cell. Even where the host cell expresses PKS- 
like gene activity for one PUFA, expression of PKS-like genes of another PKS-like system can 
provide for production of a novel PUFA not produced by the host cell. In particular instances 
where expression of PKS-like gene activity is coupled with expression of an ORF 8 PKS-like 
gene of an organism which produces a different PUFA, it can be desirable that the host cell 
naturally have, or be mutated to have, low PKS-like gene activity for ORF 8. As an example, 
for production of EPA, the DNA sequence used encodes the polypeptide having PKS-like gene 
activity of an organism which produces EPA, while for production of DHA, the DNA sequences 
used are those from an organism which produces DHA. For use in a host cell which already 
expresses PKS-like gene activity it can be necessary to utilize an expression cassette which 
provides for overexpression of the desired PKS-like genes alone or with a construct to 
downregulate the activity of an existing ORF of the existing PKS-like system, such as by 
antisense or co-suppression. Similarly, a combination of ORFs derived from separate organisms 
which produce the same or different PUFAs using PKS-like systems may be used. For instance, 
the ORF 8 of Vibrio directs the production of DHA in a host cell, even when ORFs 3, 6, 7 and 9 
are from Shewanella, which produce EPA when coupled to ORF 8 of Shewanella. Therefore, for 
production of eicosapentaenoic acid (EPA), the expression cassettes used generally include one or 
more cassettes which include ORFs 3, 6, 7, 8 and 9 from a PUFA-producing organism such as 
the marine bacterium Shewanella putrefaciens (for EPA production) or Vibrio marinus (for 
DHA production). ORF 8 can be used for induction of DHA production, and ORF 8 of Vibrio 
can be used in conjunction with ORFs 3, 6, 7 and 9 of Shewanella to produce DHA. The 
organization and numbering scheme of the ORFs identified in the Shewanella gene cluster are 
shown in Fig. 1A. Maps of several subclones referred to in this study are shown in Fig. 1B. For 
expression of a PKS-like gene polypeptide, transcriptional and translational initiation and 
termination regions functional in the host cell are operably linked to the DNA encoding the 
PKS-like gene polypeptide. 

Constructs comprising the PKS-like ORFs of interest can be introduced into a host cell 
by any of a variety of standard techniques, depending in part upon the type of host cell. These 
techniques include transfection, infection, biolistic impact, electroporation, microinjection, 
scraping, or any other method which introduces the gene of interest into the host cell (see USPN 
4,743,548, USPN 4,795,855, USPN 5,068,193, USPN 5,188,958, USPN 5,463,174, 
USPN 5,565,346 and USPN 5,565,347). Methods of transformation which are used include 
lithium acetate transformation (Methods in Enzymology, (1991) 194:186-187). For convenience, 
a host cell which has been manipulated by any method to take up a DNA sequence or construct 
will be referred to as "transformed" or "recombinant" herein. The subject host will have at 
least one copy of the expression construct and may have two or more, depending upon whether 
the gene is integrated into the genome, amplified, or is present on an extrachromosomal element 
having multiple copy numbers. 

For production of PUFAs, depending upon the host cell, the several polypeptides 
produced by pEPA, ORFs 3, 6, 7, 8 and 9, are introduced as individual expression constructs or 
can be combined into two or more cassettes which are introduced individually or co-transformed 
into a host cell. A standard transformation protocol is used. For plants, where fewer than all 
PKS-like genes required for PUFA synthesis have been inserted into a single plant, plants containing 
a complementing gene or genes can be crossed to obtain plants containing a full complement of 
PKS-like genes to synthesize a desired PUFA. 

The PKS-like-mediated production of PUFAs can be performed in either prokaryotic or 
eukaryotic host cells. The cells can be cultured or formed as part or all of a host organism 
including an animal. Viruses and bacteriophage also can be used with appropriate cells in the 
production of PUFAs, particularly for gene transfer, cellular targeting and selection. Any type of 
plant cell can be used for host cells, including dicotyledonous plants, monocotyledonous plants, 
and cereals. Of particular interest are crop plants such as Brassica, Arabidopsis, soybean, corn, 
and the like. Prokaryotic cells of interest include Escherichia, Bacillus, Lactobacillus, 
cyanobacteria and the like. Eukaryotic cells include plant cells, mammalian cells such as those 
of lactating animals, avian cells such as of chickens, and other cells amenable to genetic 
manipulation including insect, fungal, and algae cells. Examples of host animals include mice, 
rats, rabbits, chickens, quail, turkeys, cattle, sheep, pigs, goats, yaks, etc., which are amenable to 
genetic manipulation and cloning for rapid expansion of a transgene expressing population. For 
animals, PKS-like transgenes can be adapted for expression in target organelles, tissues and 
body fluids through modification of the gene regulatory regions. Of particular interest is the 
production of PUFAs in the breast milk of the host animal. 

Examples of host microorganisms include Saccharomyces cerevisiae, Saccharomyces 
carlsbergensis, or other yeast such as Candida, Kluyveromyces or other fungi, for example, 
filamentous fungi such as Aspergillus, Neurospora, Penicillium, etc. Desirable characteristics of 
a host microorganism are, for example, that it is genetically well characterized, can be used for 
high level expression of the product using ultra-high density fermentation, and is on the GRAS 
(generally recognized as safe) list since the proposed end product is intended for ingestion by 
humans. Of particular interest is use of a yeast, more particularly baker's yeast (S. cerevisiae), as 
a cell host in the subject invention. Strains of particular interest are SC334 (Mat a pep4-3 
prb1-1122 ura3-52 leu2-3,112 reg1-501 gal1; Hovland et al. (1989) Gene 83:57-64), BJ1995 (Yeast 
Genetic Stock Centre, 1021 Donner Laboratory, Berkeley, CA 94720), INVSC1 (Mat a his3Δ1 
leu2 trp1-289 ura3-52; Invitrogen, 1600 Faraday Ave., Carlsbad, CA 92008) and INVSC2 (Mat 
a his3Δ200 ura3-167; Invitrogen). Bacterial cells also may be used as hosts. This includes 
E. coli, which can be useful in fermentation processes. Alternatively, a 
Lactobacillus species can be used as a host for introducing the products of the PKS-like pathway 
into a product such as yogurt. 

The transformed host cell can be identified by selection for a marker contained on the 
introduced construct. Alternatively, a separate marker construct can be introduced with the 
desired construct, as many transformation techniques introduce multiple DNA molecules into 
host cells. Typically, transformed hosts are selected for their ability to grow on selective media. 
Selective media can incorporate an antibiotic or lack a factor necessary for growth of the 
untransformed host, such as a nutrient or growth factor. An introduced marker gene therefore 
may confer antibiotic resistance, or encode an essential growth factor or enzyme, and permit 
growth on selective media when expressed in the transformed host cell. Resistance to 
kanamycin and the aminoglycoside G418 is of particular interest (see USPN 5,034,322). For 
yeast transformants, any marker that functions in yeast can be used, such as the ability to grow 
on media lacking uracil, leucine, lysine or tryptophan. 




Selection of a transformed host also can occur when the expressed marker protein can be 
detected, either directly or indirectly. The marker protein can be expressed alone or as a fusion 
to another protein. The marker protein can be one which is detected by its enzymatic activity; 
for example, β-galactosidase can convert the substrate X-gal to a colored product, and luciferase 
can convert luciferin to a light-emitting product. The marker protein can be one which is 
detected by its light-producing or modifying characteristics; for example, the green fluorescent 
protein of Aequorea victoria fluoresces when illuminated with blue light. Antibodies can be 
used to detect the marker protein or a molecular tag on, for example, a protein of interest. Cells 
expressing the marker protein or tag can be selected, for example, visually, or by techniques 
such as FACS or panning using antibodies. 

The PUFAs produced using the subject methods and compositions are found in the host 
plant tissue and/or plant part as free fatty acids and/or in conjugated forms such as acylglycerols, 
phospholipids, sulfolipids or glycolipids, and can be extracted from the host cell through a 
variety of means well-known in the art. Such means include extraction with organic solvents, 
sonication, supercritical fluid extraction using for example carbon dioxide, and physical means 
such as presses, or combinations thereof. Of particular interest is extraction with methanol and 
chloroform. Where appropriate, the aqueous layer can be acidified to protonate negatively 
charged moieties and thereby increase partitioning of desired products into the organic layer. 
After extraction, the organic solvents can be removed by evaporation under a stream of nitrogen. 
When isolated in conjugated forms, the products are enzymatically or chemically cleaved to 
release the free fatty acid or a less complex conjugate of interest, and are then subjected to 
further manipulations to produce a desired end product. Desirably, conjugated forms of fatty 
acids are cleaved with potassium hydroxide. 

If further purification is necessary, standard methods can be employed. Such methods 
include extraction, treatment with urea, fractional crystallization, HPLC, fractional distillation, 
silica gel chromatography, high speed centrifugation or distillation, or combinations of these 
techniques. Protection of reactive groups, such as the acid or alkenyl groups, can be done at any 
step through known techniques, for example alkylation or iodination. Methods used include 
methylation of the fatty acids to produce methyl esters. Similarly, protecting groups can be 
removed at any step. Desirably, purification of fractions containing DHA and EPA is 
accomplished by treatment with urea and/or fractional distillation. 

The uses of the subject invention are several. Probes based on the DNAs of the present 
invention find use in methods for isolating related molecules or in methods to detect organisms 
expressing PKS-like genes. When used as probes, the DNAs or oligonucleotides need to be 
detectable. This is usually accomplished by attaching a label either at an internal site, for 
example via incorporation of a modified residue, or at the 5' or 3' terminus. Such labels can be 
directly detectable, can bind to a secondary molecule that is detectably labeled, or can bind to an 
unlabelled secondary molecule and a detectably labeled tertiary molecule; this process can be 
extended as long as is practicable to achieve a satisfactorily detectable signal without 
unacceptable levels of background signal. Secondary, tertiary, or bridging systems can include 
use of antibodies directed against any other molecule, including labels or other antibodies, or can 
involve any molecules which bind to each other, for example a biotin-streptavidin/avidin 
system. Detectable labels typically include radioactive isotopes, molecules which chemically or 
enzymatically produce or alter light, enzymes which produce detectable reaction products, 
magnetic molecules, fluorescent molecules or molecules whose fluorescence or light-emitting 
characteristics change upon binding. Examples of labelling methods can be found in USPN 
5,011,770. Alternatively, the binding of target molecules can be directly detected by measuring 
the change in heat of solution on binding of a probe to a target via isothermal titration 
calorimetry, or by coating the probe or target on a surface and detecting the change in scattering 
of light from the surface produced by binding of a target or a probe, respectively, as is done with 
the BIAcore system. 

PUFAs produced by recombinant means find applications in a wide variety of areas. 
Supplementation of humans or animals with PUFAs in various forms can result in increased 
levels not only of the added PUFAs, but of their metabolic progeny as well. Complex regulatory 
mechanisms can make it desirable to combine various PUFAs, or to add different conjugates of 
PUFAs, in order to prevent, control or overcome such mechanisms to achieve the desired levels 
of specific PUFAs in an individual. In the present case, expression of PKS-like genes, or 
antisense PKS-like gene transcripts, can alter the levels of specific PUFAs, or derivatives 
thereof, found in plant parts and/or plant tissues. The PKS-like gene polypeptide coding region 
is expressed either by itself or with other genes, in order to produce tissues and/or plant parts 
containing higher proportions of desired PUFAs or containing a PUFA composition which more 
closely resembles that of human breast milk (Prieto et al., PCT publication WO 95/24494) than 
do the unmodified tissues and/or plant parts. 

PUFAs, or derivatives thereof, made by the disclosed method can be used as dietary 
supplements for patients undergoing intravenous feeding or for preventing or treating 
malnutrition. For dietary supplementation, the purified PUFAs, or derivatives thereof, can be 
incorporated into cooking oils, fats or margarines formulated so that in normal use the recipient 
receives a desired amount of PUFA. The PUFAs also can be incorporated into infant formulas, 
nutritional supplements or other food products, and find use as anti-inflammatory or cholesterol 
lowering agents. 

Particular fatty acids such as EPA can be used to alter the composition of infant formulas 
to better replicate the PUFA composition of human breast milk. The predominant triglyceride in 
human milk is reported to be 1,3-di-oleoyl-2-palmitoyl, with 2-palmitoyl glycerides reported as 
better absorbed than 2-oleoyl or 2-linoleoyl glycerides (see USPN 4,876,107). Typically, human 
breast milk has a fatty acid profile comprising from about 0.15 % to about 0.36 % as DHA, from 
about 0.03 % to about 0.13 % as EPA, from about 0.30 % to about 0.88 % as ARA, from about 
0.22 % to about 0.67 % as DGLA, and from about 0.27 % to about 1.04 % as GLA. A preferred 
ratio of GLA:DGLA:ARA in infant formulas is from about 1:1:4 to about 1:1:1, respectively. 
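
The ranges quoted above can be turned into a simple screening check for a candidate oil blend. The following is a minimal sketch only: the function names, the example profile, and the 10 % tolerance on the GLA:DGLA term are illustrative assumptions and are not part of the disclosure.

```python
# Ranges (percent of total fatty acids) quoted in the text for human breast milk.
BREAST_MILK_RANGES = {
    "DHA": (0.15, 0.36),
    "EPA": (0.03, 0.13),
    "ARA": (0.30, 0.88),
    "DGLA": (0.22, 0.67),
    "GLA": (0.27, 1.04),
}


def out_of_range(profile):
    """Return the fatty acids whose percentage falls outside the quoted range."""
    return sorted(
        name
        for name, (low, high) in BREAST_MILK_RANGES.items()
        if not (low <= profile.get(name, 0.0) <= high)
    )


def gla_dgla_ara_ratio_ok(gla, dgla, ara, tol=0.10):
    """Check GLA:DGLA:ARA against the span from about 1:1:4 to about 1:1:1.

    `tol` (hypothetical) is the allowed relative deviation of DGLA from GLA.
    """
    if gla <= 0:
        return False
    return abs(dgla / gla - 1.0) <= tol and 1.0 <= ara / gla <= 4.0


# Hypothetical example profile, for illustration only.
example = {"DHA": 0.20, "EPA": 0.05, "ARA": 0.50, "DGLA": 0.30, "GLA": 0.50}
print(out_of_range(example))
print(gla_dgla_ara_ratio_ok(1.0, 1.0, 2.0))
```
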

Amounts of oils providing these ratios of PUFA can be determined without undue 
experimentation by one of skill in the art. PUFAs, or host cells containing them, also can be 
used as animal food supplements to alter an animal's tissue or milk fatty acid composition to one 
more desirable for human or animal consumption. 

For pharmaceutical use (human or veterinary), the compositions generally are 
administered orally but can be administered by any route by which they may be successfully 
absorbed, e.g., parenterally (i.e. subcutaneously, intramuscularly or intravenously), rectally or 
vaginally or topically, for example, as a skin ointment or lotion. Where available, gelatin 
capsules are the preferred form of oral administration. Dietary supplementation as set forth 
above also can provide an oral route of administration. The unsaturated acids of the present 
invention can be administered in conjugated forms, or as salts, esters, amides or prodrugs of the 
fatty acids. Any pharmaceutically acceptable salt is encompassed by the present invention; 
especially preferred are the sodium, potassium or lithium salts. Also encompassed are the N- 
alkylpolyhydroxamine salts, such as N-methyl glucamine, described in PCT publication WO 
96/33155. Preferred esters are the ethyl esters. 

The PUFAs of the present invention can be administered alone or in combination with a 
pharmaceutically acceptable carrier or excipient. As solid salts, the PUFAs can also be 
administered in tablet form. For intravenous administration, the PUFAs or derivatives thereof 
can be incorporated into commercial formulations such as Intralipids. Where desired, the 
individual components of formulations can be individually provided in kit form, for single or 
multiple use. A typical dosage of a particular fatty acid is from 0.1 mg to 20 g, or even 100 g 
daily, and is preferably from 10 mg to 1, 2, 5 or 10 g daily as required, or molar equivalent 
amounts of derivative forms thereof. Parenteral nutrition compositions comprising from about 2 
to about 30 weight percent fatty acids calculated as triglycerides are encompassed by the present 
invention. Other vitamins, and particularly fat-soluble vitamins such as vitamins A, D and E, as 
well as L-carnitine, optionally can be included. Where desired, a preservative such as a tocopherol 
can be added, typically at about 0.1% by weight. 
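
The phrase "molar equivalent amounts of derivative forms" can be illustrated with a short calculation. This sketch assumes EPA as the free acid (C20H30O2) and its ethyl ester (C22H34O2) as the derivative, and uses standard atomic masses; the function names and the 1000 mg dose are illustrative choices, not values prescribed by the text.

```python
# Standard atomic masses (g/mol), rounded to three decimals.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}

# Elemental compositions (assumed example compounds).
EPA_FREE_ACID = {"C": 20, "H": 30, "O": 2}
EPA_ETHYL_ESTER = {"C": 22, "H": 34, "O": 2}


def molar_mass(formula):
    """Molar mass in g/mol for a composition given as {element: count}."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def molar_equivalent_mass(dose_mg, acid, derivative):
    """Mass of derivative (mg) containing the same number of moles as dose_mg of acid."""
    return dose_mg * molar_mass(derivative) / molar_mass(acid)


# Ethyl ester mass that is molar-equivalent to 1000 mg of free EPA.
print(round(molar_equivalent_mass(1000.0, EPA_FREE_ACID, EPA_ETHYL_ESTER), 1))
```

Because the ester is heavier per mole than the free acid, a slightly larger mass of ester is needed to deliver the same molar amount of fatty acid.
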

The following examples are presented by way of illustration, not of limitation. 

EXAMPLES 
Example 1 

The Identity of ORFs Derived from Vibrio marinus 

Polymerase chain reaction (PCR) using primers based on ORF 6 of Shewanella (Sp 
ORF 6) sequences (FW 5' primers CUACUACUACUACCAAGCTAAAGCACTTAACCGTG, 
SEQ ID NO:41, and CUACUACUACUAACAGCGAAATGCTTATCAAG, SEQ ID NO:42, for 
Vibrio and SS9 respectively, and 3' BW primer 
CAUCAUCAUCAUGCGACCAAAACCAAATGAGCTAATAC, SEQ ID NO:43, for both 
Vibrio and SS9) and genomic DNA templates from Vibrio and a barophilic photobacter 
producing EPA (provided by Dr. Bartlett, UC San Diego) resulted in PCR products of ca. 400 
bases for Vibrio marinus (Vibrio) and ca. 900 bases for SS9 presenting more than 75% homology 
with corresponding fragments of Sp ORF 6 (see Figure 25) as determined by direct counting of 
homologous amino acids. 
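
Direct counting of matching residues over an alignment can be sketched as follows. This is a simplified illustration that counts identical residues at aligned, ungapped positions; the text does not state whether conservative substitutions were also counted as homologous, and the two short sequences are invented for the example and are not taken from Figure 25.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned, ungapped positions at which the two residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip positions where either sequence has a gap
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * matches / compared


# Invented toy alignment: 7 of 8 positions match.
print(percent_identity("MKTAYIAK", "MKTAYLAK"))
```
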

A Vibrio cosmid library was then prepared and, using the Vibrio ORF 6 PCR product as a 
probe (see Figure 26), clones containing at least ORF 6 were selected by colony hybridization. 

Through additional sequencing of the selected cosmids such as cosmid #9 and cosmid 
#21, a Vibrio cluster (Figure 5) with ORFs homologous to, and organized in the same sequential 
order (ORFs 6-9) as ORFs 6-9 of Shewanella, was obtained (Figure 7). The Vibrio ORFs from 
this sequence are found at 17394 to 36115 and comprise ORFs 6-9. 

Table 
Vibrio operon figures 

17394 to 25349    length = 7956 nt 
25509 to 28157    length = 2649 nt 
28209 to 34262    length = 6054 nt 
34454 to 36115    length = 1662 nt 
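
The lengths in the table follow from the coordinates as end minus start plus one. A minimal sketch that recomputes them, checks that each is a whole number of codons, and derives the spacing between consecutive coordinate ranges (variable and function names are illustrative):

```python
# (start, end) coordinates as listed in the table, 1-based and inclusive.
VIBRIO_ORF_COORDS = [
    (17394, 25349),
    (25509, 28157),
    (28209, 34262),
    (34454, 36115),
]


def orf_length(start, end):
    """Length in nucleotides of an inclusive coordinate range."""
    return end - start + 1


def intergenic_gaps(coords):
    """Number of nucleotides between the end of each range and the start of the next."""
    return [
        next_start - prev_end - 1
        for (_, prev_end), (next_start, _) in zip(coords, coords[1:])
    ]


lengths = [orf_length(start, end) for start, end in VIBRIO_ORF_COORDS]
print(lengths)
print(all(n % 3 == 0 for n in lengths))
print(intergenic_gaps(VIBRIO_ORF_COORDS))
```
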

The ORF designations for the Shewanella genes are based on those disclosed in Figure 4, and 
differ from those published for the Shewanella cluster (Yazawa et al., USPN 5,683,898). For 
instance, ORF 3 of Figure 4 is read in the opposite direction from the other ORFs and is not 
disclosed in Yazawa et al., USPN 5,683,898 (see Fig. 24 for comparison with Yazawa et al., 
USPN 5,683,898). 

Sequences homologous to ORF 3 were not found in the proximity of ORF 6 (17000 
bases upstream of ORF 6) or of ORF 9 (ca. 4000 bases downstream of ORF 9). Motifs 
characteristic of phosphopantetheinyl transferases (Lambalot et al. (1996) Current Biology 3:923- 
936) were absent from the Vibrio sequences screened for these motifs. In addition, there was no 
match to Sp ORF 3 derived probes in genomic digests of Vibrio and of SC2A Shewanella 
(another bacterium provided by the University of San Diego and also capable of producing 
EPA). Although ORF 3 may exist in Vibrio, its DNA may not be homologous to that of Sp 
ORF 3 and/or could be located in portions of the genome that were not sequenced. 

Figure 6 provides the sequence of an approximately 19 kb Vibrio clone comprising ORFs 
6-9. Figures 7 and 8 compare the gene cluster organizations of the PKS-like systems of Vibrio 
marinus and Shewanella putrefaciens. Figures 9 through 12 show the levels of sequence 
homology between the corresponding ORFs 6, 7, 8 and 9, respectively. 

Example 2 
ORF 8 Directs DHA Production 

As described in Example 1, DNA homologous to Sp ORF 6 was found in an unrelated 
species, SS9 Photobacter, which also is capable of producing EPA. Additionally, ORFs 
homologous to Sp ORFs 6-9 were found in the DHA-producing Vibrio marinus (Vibrio). From 
these ORFs a series of experiments was designed in which deletions in each of Sp ORFs 6-9 that 
suppressed EPA synthesis in E. coli (Yazawa (1996) supra) were complemented by the 
corresponding homologous genes from Vibrio. 

The Sp EPA cluster was used to determine if any of the Vibrio ORFs 6-9 was responsible 
for the production of DHA. Deletion mutants provided for each of the Sp ORFs are EPA and 
DHA null. Each deletion was then complemented by the corresponding Vibrio ORF expressed 
behind a lac promoter (Figure 13). 

The complementation of a Sp ORF 6 deletion by a Vibrio ORF 6 reestablished the 
production of EPA. Similar results were obtained by complementing the Sp ORF 7 and ORF 9 
deletions. By contrast, the complementation of a Sp ORF 8 deletion resulted in the production 
of C22:6 (DHA). Vibrio ORF 8 therefore appears to be a key element in the synthesis of DHA. 
Figures 14 and 15 show chromatograms of fatty acid profiles from the respective 
complementations of Sp del ORF 6 with Vibrio ORF 6 (EPA and no DHA) and Sp del ORF 8 
with Vibrio ORF 8 (DHA). Figure 16 shows the fatty acid percentages for the ORF 8 
complementation, again demonstrating that ORF 8 is responsible for DHA production. 
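
The complementation outcomes described above can be summarized as a small lookup table. This is an illustrative restatement of the reported results, recording only the product highlighted in the text for each complementation; the names are hypothetical and no quantitative data are implied.

```python
# Product highlighted in the text when each Sp ORF deletion is complemented
# by the corresponding Vibrio ORF (key = ORF number).
COMPLEMENTATION_PRODUCT = {6: "EPA", 7: "EPA", 8: "DHA", 9: "EPA"}


def orfs_yielding(product):
    """Return the ORF numbers whose complementation is reported to yield `product`."""
    return sorted(
        orf for orf, observed in COMPLEMENTATION_PRODUCT.items() if observed == product
    )


print(orfs_yielding("DHA"))
print(orfs_yielding("EPA"))
```
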

These data show that polyketide-like synthesis genes with related or similar ORFs can be 
combined and expressed in a heterologous system and used to produce a distinct PUFA species 
in the host system, and that ORF 8 has a role in determining the ultimate chain length. The 
Vibrio ORFs 6, 7, 8, and 9 reestablish EPA synthesis. In the case of Vibrio ORF 8, DHA is also 
present (ca. 0.7%) along with EPA (ca. 0.6%), indicating that this gene plays a significant role in 
directing synthesis of DHA vs. EPA for these systems. 

Example 3 
Requirements for Production of DHA 

To determine how Vibrio ORFs of the cluster ORF 6-9 are used in combination with 
Vibrio ORF 8, combinations of Vibrio ORF 8 with some or all of the other ORFs of the Vibrio 
ORF 6-9 cluster were created to investigate the synthesis of DHA. 

Vibrio ORFs 6-9 were complemented with Sp ORF 3. The results of this 
complementation are presented in Figures 16b and 16c. The significant amounts of DHA 
measured (greater than about 9%) and the absence of EPA suggest that no ORFs other than those 
of Vibrio ORFs 6-9 are required for DHA synthesis when combined with Sp ORF 3. This 
suggests that Sp ORF 3 plays a general function in the synthesis of bacterial PUFAs. 

With respect to DHA vs. EPA production, it may be necessary to combine Vibrio 
ORF 8 with other Vibrio ORFs of the 6-9 cluster in order to specifically produce DHA. The 
roles of Vibrio ORF 9 and each of the combinations of Vibrio ORFs (6, 8), (7, 8), (8, 9), etc. in 
the synthesis of DHA are being studied. 

Example 4 
Plant Expression Constructs 

A cloning vector with very few restriction sites was designed to facilitate the cloning of 
large fragments and their subsequent manipulation. An adapter was assembled by annealing 
oligonucleotides with the sequences AAGCCCGGGCTT, SEQ ID NO:44, and 
GTACAAGCCCGGGCTTAGCT, SEQ ID NO:45. This adapter was ligated to the vector 
pBluescript II SK+ (Stratagene) after digestion of the vector with the restriction endonucleases 
Asp718 and SstI. The resulting vector, pCGN7769, had a single SrfI (and embedded SmaI) 
cloning site for the cloning of blunt ended DNA fragments. 
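
How the two adapter oligonucleotides pair can be checked computationally. The sketch below is an illustration only: it locates the region of the longer oligonucleotide that base-pairs with the shorter one and reports the unpaired ends, which come out as GTAC at the 5' end and AGCT at the 3' end, consistent with the cohesive ends expected for Asp718 and SstI. The recognition sequences GCCCGGGC (SrfI) and CCCGGG (SmaI) are standard values supplied here as assumptions, and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


# Adapter oligonucleotides as given in the text (SEQ ID NO:44 and NO:45).
OLIGO_44 = "AAGCCCGGGCTT"
OLIGO_45 = "GTACAAGCCCGGGCTTAGCT"


def split_overhangs(long_oligo, short_oligo):
    """Return (5' overhang, duplex region, 3' overhang) of long_oligo when
    short_oligo is annealed to it."""
    paired = reverse_complement(short_oligo)
    start = long_oligo.find(paired)
    if start < 0:
        raise ValueError("oligonucleotides are not complementary")
    end = start + len(paired)
    return long_oligo[:start], long_oligo[start:end], long_oligo[end:]


five_prime, duplex, three_prime = split_overhangs(OLIGO_45, OLIGO_44)
print(five_prime, duplex, three_prime)
print("GCCCGGGC" in duplex, "CCCGGG" in duplex)  # SrfI site and embedded SmaI site
```
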

A plasmid containing the napin cassette from pCGN3223 (USPN 5,639,790) was 
modified to make it more useful for cloning large DNA fragments containing multiple restriction 
sites, and to allow the cloning of multiple napin fusion genes into plant binary transformation 
vectors. An adapter comprised of the self annealed oligonucleotide of sequence 
CGCGATTTAAATGGCGCGCCCTGCAGGCGGCCGCCTGCAGGGCGCGCCATTTAAAT, 
SEQ ID NO:46, was ligated into the vector pBC SK+ (Stratagene) after 
digestion of the vector with the restriction endonuclease BssHII to construct vector pCGN7765. 
Plasmids pCGN3223 and pCGN7765 were digested with NotI and ligated together. The 
resultant vector, pCGN7770 (Figure 17), contains the pCGN7765 backbone and the napin seed 
specific expression cassette from pCGN3223. 
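
The self-annealing property of the SEQ ID NO:46 adapter, and the restriction sites it carries, can be verified with a short script. This is a sketch under stated assumptions: the recognition sequences listed for SwaI, AscI, Sse8387I and NotI are standard values supplied here rather than taken from the text, and the helper names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


# SEQ ID NO:46 as printed in the text (two wrapped lines joined).
ADAPTER_46 = (
    "CGCGATTTAAATGGCGCGCCCTGCAGGCGGCCGCCTGCAGGGCGC"
    "GCCATTTAAAT"
)

# Recognition sequences (assumed standard values).
SITES = {
    "SwaI": "ATTTAAAT",
    "AscI": "GGCGCGCC",
    "Sse8387I": "CCTGCAGG",
    "NotI": "GCGGCCGC",
}


def find_sites(seq, sites):
    """Return {enzyme: [0-based start positions]} for each recognition sequence."""
    return {
        name: [i for i in range(len(seq) - len(motif) + 1) if seq.startswith(motif, i)]
        for name, motif in sites.items()
    }


overhang, duplex = ADAPTER_46[:4], ADAPTER_46[4:]
print(overhang, duplex == reverse_complement(duplex))
print(find_sites(ADAPTER_46, SITES))
```

The duplex region is its own reverse complement, so two copies of the oligonucleotide anneal to each other and leave a CGCG overhang at each end. The single central NotI site is flanked symmetrically by Sse8387I, AscI and SwaI sites, which is consistent with the later steps in which napin fusion genes are moved between vectors as NotI or Sse8387I fragments.
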

Shewanella constructs 

Genes encoding the Shewanella proteins were mutagenized to introduce suitable cloning 
sites 5' and 3' of the ORFs using PCR. The template for the PCR reactions was DNA of the cosmid 
pEPA (Yazawa et al., supra). PCR reactions were performed using Pfu DNA polymerase 
according to the manufacturer's protocols. The PCR products were cloned into SrfI digested 
pCGN7769. The primers GTGCAGGTCGAGAGAATGTTGATTTGGTTATAGTTGTGTGC, 
SEQ ID NO:47, and GGATGGAGATGTGTAGGTAGTGTTAGGTGAAGGTGGA, 
SEQ ID NO:48, were used to amplify ORF 3, and to generate plasmid pCGN8520. 
The primers TGTAGAGTGGAGAGAATGAGGGAGAGGTGTAAAGCTAGA, 
SEQ ID NO:49, and GGGGGGGTGGAGGTAATTGGGGTGAGTGTGGTTTGGT, 
SEQ ID NO:50, were used to amplify ORF 6, and generate plasmid pCGN7776. 
The primers GAATTGGTGGAGAGAATGGGGGTGGGGATGGGAGTTATG, 
SEQ ID NO:51, and GGTAGGAGATGTTTAGAGTTGGGGTTGAAGTAAATGG, 
SEQ ID NO:52, were used to amplify ORF 7, and generate plasmid pCGN7771. 
The primers GAATTGGTGGAGAGAATGTGATTAGGAGAGAATGGTTGT, 
SEQ ID NO:53, and TGTAGAGTGGAGTTATAGAGATTGTTGGATGGTGATAG, 
SEQ ID NO:54, were used to amplify ORF 8, and generate plasmid pCGN7775. The 
primers GAATTGGTGGAGAGAATGAATGCTAGAGCAAGTAACGAA, SEQ ID NO:55, and 
TCTAGAGGATGGTTAGGGGATTGTTTGGTTTGGGTTG, SEQ ID NO:56, were used to 
amplify ORF 9, and generate plasmid pCGN7773. 

The integrity of the PCR products was verified by DNA sequencing of the inserts of 
pCGN7771, pCGN8520, and pCGN7773. ORF 6 and ORF 8 were quite large in size. In order 
to avoid sequencing the entire clones, the center portions of the ORFs were replaced with 
restriction fragments of pEPA. The 6.6 kilobase PacI/BamHI fragment of pEPA containing the 
central portion of ORF 6 was ligated into PacI/BamHI digested pCGN7776 to yield 
pCGN7776B4. The 4.4 kilobase BamHI/BglII fragment of pEPA containing the central portion 
of ORF 8 was ligated into BamHI/BglII digested pCGN7775 to yield pCGN7775A. The regions 
flanking the pEPA fragment and the cloning junctions were verified by DNA sequencing. 

Plasmid pCGN7771 was cut with XhoI and BglII and ligated to pCGN7770 after 
digestion with SalI and BglII. The resultant napin/ORF 7 gene fusion plasmid was designated 
pCGN7783. Plasmid pCGN8520 was cut with XhoI and BglII and ligated to pCGN7770 after 
digestion with SalI and BglII. The resultant napin/ORF 3 gene fusion plasmid was designated 
pCGN8528. Plasmid pCGN7773 was cut with SalI and BamHI and ligated to pCGN7770 after 
digestion with SalI and BglII. The resultant napin/ORF 9 gene fusion plasmid was designated 
pCGN7785. Plasmid pCGN7775A was cut with SalI and ligated to pCGN7770 after digestion 
with SalI. The resultant napin/ORF 8 gene fusion plasmid was designated pCGN7782. Plasmid 
pCGN7776B4 was cut with XhoI and ligated to pCGN7770 after digestion with SalI. The 
resultant napin/ORF 6 gene fusion plasmid was designated pCGN7786B4. 

A binary vector for plant transformation, pCGN5139, was constructed from pCGN1558 
(McBride and Summerfelt (1990) Plant Molecular Biology, 14:269-276). The polylinker of 
pCGN1558 was replaced as a HindIII/Asp718 fragment with a polylinker containing unique 
restriction endonuclease sites, AscI, PacI, XbaI, SwaI, BamHI, and NotI. The Asp718 and 
HindIII restriction endonuclease sites are retained in pCGN5139. pCGN5139 was digested 
with NotI and ligated with NotI digested pCGN7786B4. The resultant binary vector containing 
the napin/ORF 6 gene fusion was designated pCGN8533. Plasmid pCGN8533 was digested 
with Sse8387I and ligated with Sse8387I digested pCGN7782. The resultant binary vector 
containing the napin/ORF 6 gene fusion and the napin/ORF 8 gene fusion was designated 
pCGN8535 (Figure 18). 

The plant binary transformation vector, pCGN5139, was digested with Asp718 and 
ligated with Asp718 digested pCGN8528. The resultant binary vector containing the 
napin/ORF 3 gene fusion was designated pCGN8532. Plasmid pCGN8532 was digested with 
NotI and ligated with NotI digested pCGN7783. The resultant binary vector containing the 
napin/ORF 3 gene fusion and the napin/ORF 7 gene fusion was designated pCGN8534. Plasmid 
pCGN8534 was digested with Sse8387I and ligated with Sse8387I digested pCGN7785. The 
resultant binary vector containing the napin/ORF 3 gene fusion, the napin/ORF 7 gene fusion 
and the napin/ORF 9 gene fusion was designated pCGN8537 (Figure 19). 
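
The two Shewanella binary vectors described above each carry a subset of the ORFs, and together they supply the set of ORFs 3, 6, 7, 8 and 9 named earlier for EPA production. A minimal bookkeeping sketch (the data structure and function name are illustrative):

```python
# napin/ORF gene fusions carried by each Shewanella binary vector, per the text.
CONSTRUCT_ORFS = {
    "pCGN8535": {6, 8},
    "pCGN8537": {3, 7, 9},
}

# ORFs named in the text as used for EPA production.
REQUIRED_ORFS = {3, 6, 7, 8, 9}


def missing_orfs(constructs):
    """Return the required ORFs not supplied by the given set of constructs."""
    supplied = set()
    for name in constructs:
        supplied |= CONSTRUCT_ORFS[name]
    return REQUIRED_ORFS - supplied


print(missing_orfs(["pCGN8535", "pCGN8537"]))
print(sorted(missing_orfs(["pCGN8535"])))
```
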

Vibrio constructs 

The Vibrio ORFs for plant expression were all obtained using Vibrio cosmid #9 as a 
starting molecule. Vibrio cosmid #9 was one of the cosmids isolated from the Vibrio cosmid 
library using the Vibrio ORF 6 PCR product described in Example 1. 

A gene encoding Vibrio ORF 7 (Figure 6) was mutagenized to introduce a SalI site 
upstream of the open reading frame and a BamHI site downstream of the open reading frame using 
the PCR primers: TGTAGAGTCGACACAATGGCGGAATTAGCTGTTATTGGT, 
SEQ ID NO:57, and GTCGACGGATCCCTATTTGTTCGTGTTTGCTATATG, 
SEQ ID NO:58. A gene encoding Vibrio ORF 9 (Figure 6) was mutagenized to 
introduce a BamHI site upstream of the open reading frame and an XhoI site downstream of 
the open reading frame using the PCR primers: 
GTGGAGGGATCGAGAATGAATATAGTAAGTAATGATTGGGGA, SEQ ID NO:59, and 
GTGGACCTGGAGTTAATCAGTCGTAGGATAAGTTGGG, SEQ ID NO:60. The restriction sites were 
introduced using PCR, and the integrity of the mutagenized plasmids was verified by DNA 
sequencing. The Vibrio ORF 7 gene was cloned as a SalI-BamHI fragment into the napin cassette 
of SalI-BglII digested pCGN7770 (Figure 17) to yield pCGN8539. The Vibrio ORF 9 gene was 
cloned as a SalI-BamHI fragment into the napin cassette of SalI-BglII digested pCGN7770 (Figure 
17) to yield pCGN8543. 

Genes encoding the Vibrio ORF 6 and ORF 8 were mutagenized to introduce SalI sites 
flanking the open reading frames. The SalI sites flanking ORF 6 were introduced using PCR. 
The primers used were: CCGGGGTCGAGACAATGGCTAAAAAGAAGACCACATGGA, 
SEQ ID NO:61, and GCGGGGTGGAGTCATGAGATATCGTTCAAAATGTGAGTGA, 
SEQ ID NO:62. The central 7.3 kb BamHI-XhoI fragment of the PCR product 
was replaced with the corresponding fragment from Vibrio cosmid #9. The mutagenized ORF 6 
was cloned into the SalI site of the napin cassette of pCGN7770 to yield plasmid pCGN8554. 

The mutagenesis of ORF 8 used a different strategy. A BamHI fragment containing ORF 
8 was subcloned into plasmid pHC79 to yield cosmid #9". A SalI site upstream of the coding 
region was introduced on an adapter comprised of the oligonucleotides 
TCGAGATGGAAAATATTGGAGTAGTAGGTATTGGTAATTTGTTC, 
SEQ ID NO:63, and GGGGGAACAAATTAGGAATACGTAGTAGTGGAATATTTTGGATG, 
SEQ ID NO:64. The adapter was ligated to cosmid #9" after digestion with 
SalI and XmaI. A SalI site was introduced downstream of the stop codon by using PCR for 
mutagenesis. A DNA fragment containing the stop codon was generated using cosmid #9" as a 
template with the primers TGAGATGAAGTTTATCGATAC, SEQ ID NO:65, and 
TCATGAGAGGTCGTGGACTTACGGTTCAAGAATACT, SEQ ID NO:66. The PCR product 
was digested with the restriction endonucleases ClaI and AatII and was cloned into the cosmid 
9" derivative digested with the same enzymes to yield plasmid 8P3. The SalI fragment from 
8P3 was cloned into SalI digested pCGN7770 to yield pCGN8515. 

pCGN8532, a binary plant transformation vector that contains Shewanella ORF 3 
under control of the napin promoter, was digested with NotI, and a NotI fragment of pCGN8539 
containing a napin/Vibrio ORF 7 gene fusion was inserted to yield pCGN8552. Plasmid 
pCGN8556 (Figure 23), which contains Shewanella ORF 3 and Vibrio ORFs 7 and 9 under 
control of the napin promoter, was constructed by cloning the Sse8387I fragment from 
pCGN8543 into Sse8387I digested pCGN8552. 

The NotI digested napin/ORF 8 gene from plasmid pCGN8515 was cloned into a NotI 
digested plant binary transformation vector pCGN5139 to yield pCGN8548. The Sse8387I 
digested napin/ORF 6 gene from pCGN8554 was subsequently cloned into the Sse8387I site of 
pCGN8566. The resultant binary vector containing the napin/ORF 6 gene fusion and 
napin/ORF 8 gene fusion was designated pCGN8560 (Figure 22). 



Example 5 

Plant Transformation and PUFA Production 



EPA production 
The Shewanella constructs pCGN8535 and pCGN8537 can be transformed into the same 
or separate plants. If separate plants are used, the transgenic plants can be crossed resulting in 
heterozygous seed which contains both constructs. 

pCGN8535 and pCGN8537 are separately transformed into Brassica napus. Plants are 
selected on media containing kanamycin, and transformation by full-length inserts of the 
constructs is verified by Southern analysis. Immature seeds also can be tested for protein 
expression of the enzymes encoded by ORFs 3, 6, 7, 8, or 9 using western analysis, in which 
case the best-expressing pCGN8535 and pCGN8537 T1 transformed plants are chosen and are 
grown out for further experimentation and crossing. Alternatively, the T1 transformed plants 
showing insertion by Southern analysis are crossed to one another, producing T2 seed which has 
both insertions. In this seed, half seeds may be analyzed directly for expression of EPA in the 
fatty acid fraction. The remaining half-seeds of events with the best EPA production are grown 
out and developed through conventional breeding techniques to provide Brassica lines for 
production of EPA. 
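
The crossing strategy above can be illustrated with a simple Mendelian expectation. The sketch below is a worked example under a stated assumption that is not in the text: each T1 parent is taken to be hemizygous for a single, unlinked insertion of its construct. The function name offspring_classes and the gamete labels are hypothetical.

```python
from fractions import Fraction
from itertools import product


def offspring_classes(gametes_a, gametes_b):
    """Tabulate progeny classes from equally likely gametes of two parents.

    Each gamete is "T" (carries that parent's transgene) or "-" (does not).
    Keys of the result are (has_construct_a, has_construct_b).
    """
    counts = {}
    for ga, gb in product(gametes_a, gametes_b):
        key = (ga == "T", gb == "T")
        counts[key] = counts.get(key, 0) + 1
    total = len(gametes_a) * len(gametes_b)
    return {key: Fraction(n, total) for key, n in counts.items()}


# Assumption: both T1 parents are hemizygous for one insertion each.
classes = offspring_classes(["T", "-"], ["T", "-"])
both_constructs = classes[(True, True)]
print(both_constructs)
```

Under this assumption about one quarter of the seed from the cross would carry both constructs, which is one reason half-seed analysis is useful for picking seeds to grow out. Plants with multiple or linked insertions would give different ratios.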

Plasmids pCGN7792 and pCGN7795 also are simultaneously introduced into Brassica 
napus host cells. A standard transformation protocol is used (see, for example, USPN 5,463,174 
and USPN 5,750,871); however, Agrobacteria containing both plasmids are mixed together and 
incubated with Brassica cotyledons during the cocultivation step. Many of the resultant plants 
are transformed with both plasmids. 

DHA production 

A plant is transformed for production of DHA by introducing pCGN8556 and 
pCGN8560, either into separate plants or simultaneously into the same plants as described for 
EPA production. 

Alternatively, the Shewanella ORFs can be used in a concerted fashion with ORFs 6 and 
8 of Vibrio, such as by transforming a plant with the constructs pCGN8560 and pCGN7795, 
allowing expression of the corresponding ORFs in a plant cell. This combination provides a 
PKS-like gene arrangement comprising ORFs 3, 7 and 9 of Shewanella, with an ORF 6 derived 
from Vibrio and also an ORF 8 derived from Vibrio. As described above, ORF 8 is the PKS-like 
gene which controls the identity of the final PUFA product. Thus, the resulting transformed 
plants produce DHA in plant oil. 

Example 6 

Transgenic plants containing the Shewanella PUFA genes 

Brassica plants 
Fifty-two plants cotransformed with plasmids pCGN8535 and pCGN8537 were analyzed 
using PCR to determine if the Shewanella ORFs were present in the transgenic plants. Forty-
one plants contained plasmid pCGN8537, and thirty-five plants contained pCGN8535. Eleven 
of the plants contained all five ORFs required for the synthesis of EPA. Several plants contained 
genes from both of the binary plasmids but appeared to be missing at least one of the ORFs. 
Analysis is currently being performed on approximately twenty additional plants. 

Twenty-three plants transformed with pCGN8535 alone were analyzed using PCR to 
determine if the Shewanella ORFs were present in the transgenic plants. Thirteen of these plants 
contained both Shewanella ORF 6 and Shewanella ORF 8. Six of the plants contained only one 
ORF. 

Nineteen plants transformed with pCGN8537 alone were analyzed using PCR to 
determine if the Shewanella ORFs were present in the transgenic plants. Eighteen of the plants 
contained Shewanella ORF 3, Shewanella ORF 7, and Shewanella ORF 9. One plant contained 
Shewanella ORFs 3 and 7. 
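
The PCR counts reported for the Brassica plants can be restated as proportions. The short Python sketch below does only that arithmetic; the counts are taken from the text, while the helper name percent and the dictionary layout are illustrative choices.

```python
def percent(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


# (positive plants, plants analyzed) as reported in the text.
pcr_results = {
    "cotransformed, pCGN8537 present": (41, 52),
    "cotransformed, pCGN8535 present": (35, 52),
    "cotransformed, all five ORFs": (11, 52),
    "pCGN8535 alone, ORF 6 and ORF 8": (13, 23),
    "pCGN8537 alone, ORFs 3, 7 and 9": (18, 19),
}

for label, (count, total) in pcr_results.items():
    print(f"{label}: {count}/{total} = {percent(count, total)}%")
```

Roughly one fifth of the cotransformed plants carried all five ORFs, which is lower than the recovery of the individual plasmids.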

Arabidopsis 

More than 40 transgenic Arabidopsis plants cotransformed with plasmids pCGN8535 
and pCGN8537 are growing in our growth chambers. PCR analysis to determine which of the 
ORFs are present in the plants is currently underway. 

Example 7 

Evidence of A PKS System of PUFA Synthesis In Schizochytrium 
The purpose of this experiment was to identify additional sources of PKS genes. 
Polyunsaturated long chain fatty acids were identified in Schizochytrium oil. Furthermore, 
production of polyunsaturated fatty acids was detected in a culture of Schizochytrium. A freshly 
diluted culture of Schizochytrium was incubated at 24°C in the presence of [14C]-acetate 
(5 µCi/mL) for 30 min with shaking (150 rpm). The cells were then collected by centrifugation, 
lyophilized and subjected to a transesterification protocol that involved heating to 90°C for 90 
minutes in the presence of acidic (9% H2SO4) methanol with toluene (1 volume of toluene per 
two volumes of acidic methanol) as a second solvent. The resulting methyl esters were extracted 
with an organic solvent (hexane) and separated by TLC (silica gel G, developed three times with 
hexane:diethyl ether (19:1)). Radioactivity on the TLC plate was detected using a scanner 
(AMBIS). Two prominent bands were detected on the TLC plate. These bands migrated on the 
TLC plate in positions expected for short chain (14 to 16 carbon), saturated methyl esters (the 
upper band) and for methyl esters of polyunsaturated long chain (20 to 22 carbon) fatty acids 
(the lower band). These were also the major types of fatty acids detected by GC analysis of 
FAMEs of Schizochytrium oil. 
In a parallel experiment, thiolactomycin, a well known inhibitor of Type II fatty acid 
synthesis systems as well as several polyketide synthesis systems, including EPA production by 
E. coli transformed with PKS genes derived from Shewanella, was added to the test tubes at 
varying concentrations (0, 1, 10 and 100 µg/mL) prior to addition of the Schizochytrium cell 
cultures and [14C]-acetate. Analysis of incorporation of [14C]-acetate, as described above, 
revealed that 100 µg/mL thiolactomycin completely blocked synthesis of polyunsaturated fatty 
acids, while partial inhibition of synthesis of polyunsaturated fatty acids was observed at 10 
µg/mL. Synthesis of the short chain saturated fatty acids was unaffected at all 
tested thiolactomycin concentrations. Thiolactomycin does not inhibit Type I fatty acid 
synthesis systems and is not toxic to mice, suggesting that it does not inhibit the elongation 
system leading to EPA or DHA formation. Furthermore, thiolactomycin did not inhibit the 
elongation system leading to PUFA synthesis in Phaeodactylum tricornutum. Therefore, 
although Schizochytrium is known to possess a Type I fatty acid synthesis system, the data 
suggested that the polyunsaturated fatty acids produced in this organism were derived from a 
system which was distinct from the Type I fatty acid synthesis system which produced short 
chain fatty acids, and from a system that was similar to the elongation/desaturation pathway 
found in mice and Phaeodactylum. The data are consistent with DHA formation being a result 
of a PKS pathway as found in Vibrio marinus and Shewanella putrefaciens. 

Example 8 

PKS Related Sequences From Schizochytrium 
The purpose of this experiment was to identify sequences from Schizochytrium that 
encoded PKS genes. A cDNA library from Schizochytrium was constructed and approximately 
8,000 random clones (ESTs) were sequenced. The protein sequence encoded by the Shewanella 
EPA synthesis genes was compared to the predicted amino acid sequences of the Schizochytrium 
ESTs using a Smith/Waterman alignment algorithm. When the protein sequence of ORF6 
(Shewanella) was compared with the amino acid sequences from the Schizochytrium ESTs, a 
number of EST clones showed a significant degree of identity (P<0.01). When the protein 
sequence of ORF7 was compared with the Schizochytrium ESTs, 4 EST clones showed 
significant identity (P<0.01), suggesting that the molecules were homologous. When the protein 
sequences of ORF8 and ORF9 were compared with the Schizochytrium ESTs, 7 and 14 clones, 
respectively, showed significant identity (P<0.01). 
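
For readers unfamiliar with the Smith/Waterman algorithm named above, the following Python sketch computes a local alignment score by dynamic programming. It is a minimal illustration, not the implementation used in this example: the scoring values (match, mismatch, linear gap penalty) and the function name smith_waterman_score are assumptions, whereas protein comparisons of the kind described would normally use an amino acid substitution matrix and a significance estimate.

```python
def smith_waterman_score(a, b, match=3, mismatch=-3, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Uses a linear gap penalty. Cell h[i][j] holds the best score of any
    local alignment ending at a[i-1] and b[j-1]; scores never drop below 0.
    """
    rows, cols = len(a) + 1, len(b) + 1
    h = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            step = match if a[i - 1] == b[j - 1] else mismatch
            h[i][j] = max(
                0,                    # start a new local alignment
                h[i - 1][j - 1] + step,  # align a[i-1] with b[j-1]
                h[i - 1][j] + gap,    # gap in b
                h[i][j - 1] + gap,    # gap in a
            )
            best = max(best, h[i][j])
    return best


# Toy nucleotide example; the best local alignment is GTT-AC / GTTGAC.
score = smith_waterman_score("TGTTACGG", "GGTTGACTA")
print(score)
```

Because the score is clamped at zero, unrelated flanking sequence does not penalize a strong local match, which is what makes the method suitable for comparing a full-length protein against short EST translations.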

Example 9 
Analysis of Schizochytrium cDNA Clones 
Restriction enzyme analysis of the Schizochytrium EST clones was used to determine the 
longest clones, which were subsequently sequenced in their entirety. All of the EST sequences 
described in Example 8 were determined to be part of 5 cDNA clones. 
Two of the cDNA clones were homologous to Shewanella ORF6. LIB3033-047-B5 was 
homologous to the C-terminus of ORF6. The sequence of LIB3033-047-B5 could be aligned 
with Shewanella ORF6 from amino acids 2093 onwards. The open reading frame of LIB3033-
047-B5 extended all the way to the 5' end of the sequence, thus this clone was not likely to be 
full length. LIB3033-046-E6 shared homology to the ACP domain of ORF6. It contained 6 
ACP repeats. This cDNA clone did not have a poly-A-tail, and therefore, it was likely to be a 
partial cDNA with additional regions of the cDNA found downstream of the sequence. The 
PCR primers GTGATGATCTTTCCCTGATGCACGCCAAGG (SEQ ID NO: 67) and 
AGCTCGAGACCGGCAACCCGCAGCGCCAGA (SEQ ID NO: 68) were used to amplify a 
fragment of approximately 500 nucleotides from Schizochytrium genomic DNA. Primer 
GTGATGATCTTTCCCTGATGCACGCCAAGG was derived from LIB3033-046-E6, and 
primer AGCTCGAGACCGGCAACCCGCAGCGCCAGA was derived from LIB3033-047-B5. 
Thus, LIB3033-046-E6 and LIB3033-047-B5 represented different portions of the same mRNA 
(see Figure 28) and could be assembled into a single partial cDNA sequence (see Figure 27A), 
SEQ ID NO: 69, that was predicted to encode a protein with the sequence in Figure 29A (SEQ 
ID NO: 70). The open reading frame extended all the way to the 5' end of the sequence, thus this 
partial cDNA was not likely to be full length. Analysis of additional cDNA or genomic clones 
will allow the determination of the full extent of the mRNA represented by clones LIB3033-046-
E6 and LIB3033-047-B5. It may contain condensing enzyme related domains similar to those 
found near the N-terminus of Shewanella ORF6. 
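
The linkage argument above rests on a primer from one clone and a primer from the other amplifying a single genomic fragment. The sketch below shows, on a made-up template, how a product length follows from the positions of a forward primer and the reverse complement of a reverse primer. The function names and the toy template are hypothetical; no sequence from the sequence listing is assumed here.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def amplicon_length(template, forward, reverse):
    """Return the expected product length, or None if a primer does not match.

    The forward primer must match the template strand as given; the reverse
    primer must match the opposite strand downstream of the forward primer.
    """
    start = template.find(forward)
    if start < 0:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end < 0:
        return None
    return end + len(reverse_site) - start


# Toy template: 4 nt flank, 6 nt forward site, 8 nt spacer, 6 nt reverse site.
toy_template = "AAAA" + "ATGCCG" + "TTTTTTTT" + "GGATCA" + "CC"
toy_length = amplicon_length(toy_template, "ATGCCG", reverse_complement("GGATCA"))
print(toy_length)
```

A product is obtained only when both primer sites lie on the same contiguous template in the correct orientation, which is why a single amplified fragment supports the conclusion that the two clones derive from one mRNA.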

One of the cDNA clones, LIB3033-046-D2, was homologous to Shewanella ORF9 at its 
3' end. This clone was homologous to the chain length factor region of Shewanella ORF8 at its 
5' end. This clone was also homologous to the entire open reading frame of the Anabaena HglC 
ORF. The Anabaena HglC ORF is homologous to the chain length factor region of Shewanella 
ORF8 and Shewanella ORF7. Thus this cDNA (Figure 27B), SEQ ID NO: 71, was homologous 
to part of Shewanella ORF8, Shewanella ORF7 and Shewanella ORF9 (see Figure 28). The 
amino acid sequence (Figure 29B), SEQ ID NO: 72, encoded by the open reading frame of 
LIB3033-046-D2 extended all the way to the 5' end of the sequence; thus this clone was not 
likely to be full length. Analysis of additional cDNA or genomic clones will allow the 
determination of the full extent of the mRNA represented by LIB3033-046-E6. It may contain 
condensing enzyme related domains similar to those found near the N-terminus of Shewanella 
ORF8. 

Two additional cDNA clones were homologous to Shewanella ORF8. LIB81-015-D5 
was homologous to the C-terminus of ORF8. The 5' sequence of LIB81-015-D5 could be 
aligned with Shewanella ORF8 from amino acids 1900 onwards. The 3' end of LIB81-015-D5 
could be aligned with Shewanella ORF9 (see Figure 28). The amino acid sequence (Figure 
29C), SEQ ID NO: 73, encoded by the open reading frame of LIB81-015-D5 extended all the 
way to the 5' end of the sequence; thus this clone was not likely to be full length. LIB81-042-
B9 was homologous to amino acids 1150 to 1850 of Shewanella ORF8. LIB81-042-B9 did not 
have a poly-A-tail, and therefore, it was likely to be a partial cDNA with additional regions of 
the cDNA found downstream of the sequence. The PCR primers 
TACCGCGGCAAGACTATCCGCAACGTCACC (SEQ ID NO: 74) and 
GCCGTCGTGGGCGTCCACGGACACGATGTG (SEQ ID NO: 75) were used to amplify a 
fragment of approximately 500 nucleotides from Schizochytrium genomic DNA. Primer 
TACCGCGGCAAGACTATCCGCAACGTCACC was derived from LIB81-042-B9, and 
primer GCCGTCGTGGGCGTCCACGGACACGATGTG was derived from LIB81-015-D5. 
Thus, LIB81-042-B9 and LIB81-015-D5 represented different portions of the same mRNA and 
were assembled into a single partial cDNA sequence (see Figure 27C), SEQ ID NO: 76. The 
open reading frame of LIB81-042-B9 also extended all the way to the 5' end of the sequence, 
thus this clone was also not likely to be full length. Analysis of additional cDNA or genomic 
clones will allow the determination of the full extent of the mRNA represented by LIB81-042-
B9. 

By the present invention, PKS-like genes from various organisms can now be used to 
transform plant cells and modify the fatty acid compositions of plant cell membranes or plant 
seed oils through the biosynthesis of PUFAs in the transformed plant cells. Due to the nature of 
the PKS-like systems, fatty acid end-products produced in the plant cells can be selected or 
designed to contain a number of specific chemical structures. For example, the fatty acids can 
comprise the following variants: variations in the numbers of keto or hydroxyl groups at various 
positions along the carbon chain; variations in the numbers and types (cis or trans) of double 
bonds; variations in the numbers and types of branches off of the linear carbon chain (methyl, 
ethyl, or longer branched moieties); and variations in saturated carbons. In addition, the 
particular length of the end-product fatty acid can be controlled by the particular PKS-like genes 
utilized. 

All publications and patent applications mentioned in this specification are indicative of 
the level of skill of those skilled in the art to which this invention pertains. All publications and 
patent applications are herein incorporated by reference to the same extent as if each individual 
publication or patent application was specifically and individually indicated to be incorporated 
by reference. 
The invention now being fully described, it will be apparent to one of ordinary skill in 
the art that many changes and modifications can be made thereto without departing from the 
spirit or scope of the appended claims. 
What is claimed is: 

1. An isolated nucleic acid comprising: 

a Vibrio marinus nucleotide sequence selected from the group consisting of ORF 6 (SEQ 
ID NO:77), ORF 7 (SEQ ID NO:78), ORF 8 (SEQ ID NO:79), and ORF 9 (SEQ ID NO:80), as 
shown in Figure 6. 

2. An isolated nucleic acid comprising: 

a nucleotide sequence which encodes a polypeptide of a polyketide-like synthesis 
system, wherein said system produces a docosahexenoic acid when expressed in a host cell. 

3. The isolated nucleic acid according to Claim 2, wherein said nucleotide sequence is 
derived from a marine bacterium. 

4. An isolated nucleic acid according to Claim 2, wherein said nucleotide sequence is 
derived from Schizochytrium. 

5. The isolated nucleic acid according to Claim 2, wherein said nucleotide sequence is a 
Vibrio marinus ORF 8 (SEQ ID NO:79), as shown in Figure 6. 

6. An isolated nucleic acid comprising a Schizochytrium nucleotide sequence comprising a 
sequence shown in a SEQ ID NO selected from the group consisting of SEQ ID NOS: 69, 71 
and 76. 

7. An isolated nucleic acid comprising: 

a nucleotide sequence which is substantially identical to a sequence of at least 50 
nucleotides of a Vibrio marinus nucleotide sequence selected from the group consisting of ORF 
6 (SEQ ID NO:77), ORF 7 (SEQ ID NO:78), ORF 8 (SEQ ID NO:79), and ORF 9 (SEQ ID 
NO:80), as shown in Figure 6. 

8. A recombinant microbial cell comprising at least one copy of an isolated nucleic acid 
according to Claim 6. 

9. The recombinant microbial cell according to Claim 8, wherein said cell comprises each 
element of a polyketide-like synthesis system required to produce a long chain polyunsaturated 
fatty acid. 



WO 00/42195, PCT/US00/00956 

10. The recombinant microbial cell according to Claim 9, wherein said cell is a eukaryotic 
cell. 

11. The recombinant microbial cell according to Claim 10, wherein said eukaryotic cell is a 
fungal cell, an algae cell or an animal cell. 

12. The recombinant microbial cell according to Claim 11, wherein said fungal cell is a yeast 
cell and said algae cell is a marine algae cell. 

13. The recombinant microbial cell according to Claim 8, wherein said cell is a prokaryotic 
cell. 

14. The recombinant microbial cell according to Claim 13, wherein said cell is a bacterial 
cell or a cyanobacterial cell. 

15. A recombinant cell according to Claim 14, wherein said bacterial cell is a lactobacillus 
cell. 

16. The microbial cell according to Claim 8, wherein said recombinant microbial cell is 
enriched for 22:6 fatty acids as compared to a non-recombinant microbial cell which is devoid of 
said isolated nucleic acid. 

17. A method for production of docosahexenoic acid in a microbial cell culture, said method 
comprising: 

growing a microbial cell culture having a plurality of microbial cells, wherein said 
microbial cells or ancestors of said microbial cells were transformed with a vector comprising 
one or more nucleic acids having a nucleotide sequence which encodes a polypeptide of a 
polyketide synthesizing system, wherein said one or more nucleic acids are operably linked to a 
promoter, under conditions whereby said one or more nucleic acids are expressed and 
docosahexenoic acid is produced in said microbial cell culture. 

18. A method for production of a long chain polyunsaturated fatty acid in a plant cell, said 
method comprising: 

growing a plant having a plurality of plant cells, wherein said plant cells or ancestors of 
said plant cells were transformed with a vector comprising one or more nucleic acids having a 
nucleotide sequence which encodes one or more polypeptides of a polyketide synthesizing 
system which produces a long chain polyunsaturated fatty acid, wherein each of said nucleic 
acids are operably linked to a promoter functional in a plant cell, under conditions whereby said 
polypeptides are expressed and a long chain polyunsaturated fatty acid is produced in said plant 
cells. 

19. The method according to Claim 17 or Claim 18, wherein said nucleotide sequence is 
shown in a SEQ ID NO selected from the group consisting of SEQ ID NOS: 69, 71 and 76. 

20. The method according to Claim 18, wherein said long chain polyunsaturated fatty acid 
produced in said plant cells is a 20:5 and 22:6 fatty acid. 

21. The method according to Claim 17, wherein said nucleotide sequence is selected from 
the group consisting of Vibrio marinus ORF 6 (SEQ ID NO:77), ORF 7 (SEQ ID NO:78), ORF 
8 (SEQ ID NO:79), and ORF 9 (SEQ ID NO:80), as shown in Figure 6 and Shewanella 
putrefaciens ORF 6 (SEQ ID NO:83), ORF 7 (SEQ ID NO:84), ORF 8 (SEQ ID NO:85), ORF 9 
(SEQ ID NO:86), and ORF 3, which is complementary to SEQ ID NO:4, as shown in Figure 4. 

22. The method according to Claim 18, wherein said nucleic acid constructs are derived from 
two or more polyketide synthesizing systems. 

23. The method according to Claim 18, wherein said long chain polyunsaturated fatty acid is 
eicosapentenoic acid. 

24. The method according to Claim 18, wherein said long chain polyunsaturated fatty acid is 
docosahexenoic acid. 

25. A recombinant plant cell comprising: 

one or more nucleic acids having a nucleotide sequence which encodes one or more 
polypeptides of a polyketide synthesizing system which produces a long chain polyunsaturated 
fatty acid, wherein each of said nucleic acids are operably linked to a promoter functional in said 
plant cell. 

26. The recombinant plant cell according to Claim 25, wherein said nucleotide sequence is 
shown in a SEQ ID NO selected from the group consisting of SEQ ID NOS: 69, 71 and 76. 

27. The recombinant plant cell according to Claim 26, wherein said recombinant plant cell is 
a recombinant seed cell. 

28. The recombinant plant cell according to Claim 27, wherein said recombinant seed cell is 
a recombinant embryo cell. 

29. The recombinant plant cell according to Claim 26, wherein said recombinant plant cell is 
from a plant selected from the group consisting of Brassica, soybean, safflower, and sunflower. 

30. A plant oil produced by a recombinant plant cell according to Claim 26. 

31. The plant oil according to Claim 30, wherein said plant oil comprises eicosapentenoic 
acid. 

32. The plant oil according to Claim 30, wherein said plant oil comprises docosahexenoic 
acid. 

33. The plant oil according to Claim 30, wherein said plant oil is encapsulated. 

34. A dietary supplement comprising a plant oil according to Claim 30. 

35. A recombinant E. coli cell comprising: 

one or more nucleic acids having a nucleotide sequence which encodes one or more 
polypeptides of a polyketide synthesizing system which produces a long chain polyunsaturated 
fatty acid, wherein each of said nucleic acids are operably linked to a promoter functional in said 
E. coli cell. 

36. The recombinant E. coli cell according to Claim 35, wherein said long chain 
polyunsaturated fatty acid is docosahexenoic acid. 

37. The recombinant E. coli cell according to Claim 35, wherein said nucleotide sequence is 
shown in a SEQ ID NO selected from the group consisting of SEQ ID NOS: 69, 71 and 76. 

38. A plant oil produced by a recombinant plant cell wherein said plant oil comprises a long 
chain polyunsaturated fatty acid exogenous to said plant oil, wherein said plant cell is produced 
according to a method comprising: 

transforming said plant cell or an ancestor of said plant cell with a vector comprising one 
or more polypeptide of a polyketide synthesizing system which produces a long chain 
polyunsaturated fatty acid wherein each of said nucleic acids are operably linked to a promoter 
functional in said plant cell. 

39. A plant oil according to Claim 38, wherein said long chain polyunsaturated fatty acid is 
eicosapentenoic acid. 

40. A plant oil according to Claim 38, wherein said long chain polyunsaturated fatty acid is 
docosahexenoic acid. 



FIG. 2 (labels recovered from the drawing): 

FIG. 2A: Orf6, 8.3 kb, 293 kD. Domains: KAS, AT, 6xACP, KR. Motifs: DXAC*GFGG, 
GHS*XG, LGXDS*(L/I), GXGXX(GAP). Label: Acetate-like. 
FIG. 2B: Orf7, 2.3 kb, 84 kD. Domain: AT (TE?). Motif: GXS*XG. Label: HglC (C-1/2). 
FIG. 2C: Orf8, 6.0 kb, 217 kD. Domains: KAS, CLF. Motif: DXAC*GFGG. Labels: HglD, 
HglC (N-1/2). 
FIG. 2D: Orf9, 1.6 kb, 59 kD. Labels: OH, FabA; OH, FabA/(Z); Anabaena Orf352 homolog. 
FIG. 2E: Orf3, 0.8 kb, 30 kD. Label: HetI, pantetheine transferase. 
FIG. 2F 



FIG. 3 (labels 1 through 6) 



wo 00/42195 



PCT/USOO/00956 



6/134 



o 



o 



O 
CO 



O 



o o o 

O CN 
ro ^ 



o o o o o 
00 rr o r<i 
^ in ^ ^ 



o 

GO 



o 

CO 



o 
o 



o 



o 

CM 
O 




6 



wo 00/42195 



PCT/USOO/00956 



7/134 




wo 00/42195 



PCT/USOO/00956 



8/134 



o o o 

O <N 

H H 

CN <N <^ 



O 
00 



o o 

rr o 
n ^ 

CM 



o 



o o o 

(N 00 ^ 

ir» in vo 

CM CM <N 



o o o 

O 

r- CO 

CJ M 



o 

00 



o 



o 
o 
o 
m 



o 
o 




I 

d 

1^ 



wo 00/42195 



PCT/USOO/00956 



9/134 



o o o o 

(^, 00 ^ O 

r-t H ro 

m ro 




I 

O 



wo 00/42195 



PCT/LISOO/00956 



10/134 



o o o o 

^ O V0 CM 

H C^ fO 

Tj* ^ ^ ^ 



o o 

00 ^ 



o o o o 

Q VD (N 00 

if\ in ys> ^ 

^ rj* ^ 



O 



o 
o 

CO 



o o o 

VO (N 00 

CO (T\ c\ 

Tj* ^ 



o 

o 
in 



o 
o 



in 




in 
I 



wo 00/42195 



PCT/USOO/00956 



11/134 



o o o 

\0 CA <^ 

H Ol 04 

in U1 



o 
in 



o o o 

O VD <N 

^ ^ in 

in in tn 



o o o o o 

CO ^ O vx) 

in vo CO 

in in LO in tn 



o o 

CO ^ 

00 <Tv 

in in 



o o 
o <o 
o o 



o 




1 

d 



wo 00/42195 



PCT/USOO/00956 



12/134 



o 


o 


O 


O 


a> 




O 




H 


CM 




m 




V£> 


vo 





U H 



o 

04 
U) 

O 

u 
o 

a 
a 

u 

H 
O 

o 
o 



U 



o 

00 

O 
U 
< 

< 



Eh o 

H Eh 



o u 



o 



U 
O 
< 
U 



o 
in 

U 

o 



O O 
O O 

i 

O 
H 
O 



6 

O 

U 

o 
o y 



U 
U 



u 

H 

U 
< 
O 
U 
U 

< 
Eh 
O 
O 
U 
O 
U 
U 

o 




o 


o 


o 




00 




O 


o 


H 








o 






Eh 




< 


r \ 




r i 

w 


o 


H 


en 


o 


O 


o 




O 


o 




u 


£h 




o 






o 






o 


a 




o 


a 




o 


o 


Eh 


o 




o 






Eh 


O 




o 


O 




a 


U 




u 






Eh 




o 


Eh 







u y 

i ^ 

£h U 

a u 

^ ^ 

a o 

u eg 

o rf: 

< o 

a o 

o o 

o o 

O CD 

u u 

Eh < 

o o 

o o 

O Eh 



I 

< 



wo 00/42195 



PCT/USOO/00956 



13/134 



o 
o 



i 

u 

< 
O 
U 
O 

< 

u 
u 

i 

o 

CD 
O 
O 



u 
u • 
o 
o 

o- 

H 

U 

O 
U 

o 
u 

H 

O 



V) 

o 
u 

E- 

o 

H 



o 


o 


o 


vo 




00 






ro 










O 










U 


o 




< 


< 






u 






u 


rn 






r 1 












U 


a 




u 


u 






< 


r I 






CJ 




< 


S2 


u 












i 


< 


AC. 


TT 


CJ 


o 


rf! 




o 


O 


o 




U 


CD 


O 


O 


C!) 






U 






^ 


u 


o 


u 


u 


u 


CJ 


o 


o 




o 


C!) 




u 


H 


tr* 












rn 






















5 








o 




< 








o 






o 




CJ 










O 


o 




a 






u 


o 


^ 




u 


u 


O 


< 


o 


Eh 




a 


O 


< 


o 


o 








u 


E-t 


s 


<: 




o. 




u 


u 


u 










^ 












u 


U 




o 





o 

CD 
Eh 
U 
O 



H 

E- 
U 

I 

u 

CD 

a 



E-^ 
CJ 

u 
o 

CD 



a 



CJ 
Eh 

E- 

Eh 

o 

CD 



o 
o 
in 

Eh 



U CD 
U 
U 
CD 



CD CJ 



U 
CD 
U 
CD 
CD 
U 
CD 



U CD 
CJ O 

CD a 



O CD 

o u 

E-» H 



CD 

&H 

Eh 

1 



Eh U 

U CD 



Eh 

U 

\ 

CD 

U 
CD 

% 

u 
o 

. a 

u a 

CD CJ 



o 

00 
VP 

< 
u 

CD 
U 
CD 
CD 

u 



o 


o 


o 


o 


o 




o 




CN 


00 




00 


00 




Q\ 








r- 


X> 


o 




U 


Eh 


o 




I 


H 


U 


o 


< 




U 






U 










Eh 


a 






-GT 


CD 


u 


1 






£h 




I 


% 


<a: 


CD 


<: 


3 


u 


CD 




CD 






< 


CD 


O 




o 

1-4 

GO 

u 
a 

CD 
CJ 

CD 

tH 

a 
o 

CD 
CD 
CD 
CD 
El 



a 

6h 

CD 

a 

CD 

u 

CD 
Eh 
CD 

% 

a 

U 
Eh 
O 

H 

t 

< 
U 
CD 
CD 
< 



00 

I 

< 

c5 



wo 00/42195 



PCT/USOO/00956 



14/134 




o o 

CO 00 



U 
U 
U 
U 



o 



O 

U 
O 



U O 

u u 

O U 

a 

H 
O 
O 
U 



O 
U 



o o 

<X> 00 

H O 
O O 
O H 

O 
H 

U 
O 
O 

s r: 

o u 



V 

o 
u 



52 



i 

u 



o 

H 

Eh 



u 
5 



o 


O 


o 


00 




o 


00 




o 


00 


€0 


cn 


E- 


o 


< 


< 


O 


u 




r 1 
<^ 


n 


< 


£h 


U 


u 


O 


a 


E-4 


Eh 


u 


o 




o 


< 


£h 




o 


U 






o 






< 










u 








(J 




s 










S 








r ) 














CH 










u 






o 


u 






8- 






O 


U 








i 










£h 






H 








< 


o 




H 


u 




O 


o 


O 


O 




U 


O 




o 


Eh 


1 


o 






o 










u 




u 


a 


Eh 



0 o 

01 00 



H 
Eh 




Eh 
U 

6 

O 
O 

Eh 

< 
Eh 
Eh 



o a 



U 
£h 

Eh 
Eh 
tH 

£h 



Eh 

a 

Eh 



o C2 
u u 

1 1 

Eh O 

o o 

a o 

u o 
o < 

u 

Eh 
Eh 

a 

Eh 



u 

Eh 

u 

a 

Eh 

I 

Eh 



a 
< 
o 
o 
u 

t 

Eh 

CD 
H 

Eh 
U 

H 
E- 
O 



ON 
I 



wo 00/42195 



PCT/USOO/00956 



15/134 



o o 
o 



o o 



O O O 

00 ^ o 

^ \r\ <D 

a\ c\ 



o o 

VX> CN 



o o o o 

00 Tf O Vi) 

CO CTV <TV 

a\ a\ <j\ (y^ 



o 

o 
o 



o 

GO 

o 
o 



o 



o 

H 



o 
o 

CN 

o 




I 

rr 

o 



wo 00/42195 



PCT/USOO/00956 



16/134 



o 


o 




o 




(N 


00 




CM 


ro 


ro 




o 


o 


O 


o 


H 




H 




o 


o 


o 












rn 

UJ 






O 


u 


< 








u 


ti 




o 


< 




O 


o 


o 






H 


o 




O 




u 




d; 




< 






< 






u 


o 


a 


< 

ID 




o 


o 










O 




Eh 


O 


o 


O 






u 


u 


u 




o 






O 


< 






O 










u 


u 


< 




< 


o 


o 


O 








O 




o 




o 


U 


o 










a 


1 


u 


o 


o 


o 


u 


< 








o 


^- 




o 


< 



H 
H 



H 
H 

U 
O 
O 



H 



a 



U 

u 
o 
o 
< 



o 

H 

O 
O 



u 

i 




o 
o 

H 



o o 



a 



o 
o 

O 

H 
H 

U 

U 

u 

I 

E^ 

U 

a 
< 

i 



I 

< 



o 
u 



H 
O 
£-* 
O 

u 

a 
u 

E^ 

o o 
u eg 



o 
o 



u 
a 

E- 
U 

a 
u 

E^ 



Eh 

E^ 

i 

Eh 
O 
O 

CD 

o 



U Eh 



E- 



u u 

Eh " 

u 



C!) O 

< H 
Eh E- 
O O 
Eh 
O 
E- 
O 
< 



o 



O CD 

H a 

CD E^ 

U £h 

CD CD 



£h 
O 
Eh 

U CD 
CD U 
CD CD 
Eh O 
U CJ 



I 

6 



wo 00/42195 



PCT/USOO/00956 



17/134 



o 


o 


o 


00 




o 


04 


m 




H 


H 




H 






u 


"o 


O 




H 




t 


O 








o 




e 


u 


O 




u 






^ 






o 


< 


u 





U O 
H 



o 



u 



U 
< 

u 



i 

u 



O 



o 
u 

U 
< 



a 

H 

O 
U 



o 
^* 

o 

u 



o o 
u o 

U H 

u o 
o a 

E-t 
< 

a 

o o 

6 ^ 



o 



o 

CM 

to 



o 

CO 



o 



o 
o 



o o 

U) CM 



o 

CD 
00 



o 



a 
o 
o 

CM 




o 


o 




CM 


O 


H 




CM 




r-t 




id? 






r J 


C*l 




O 






r 1 








u 




o 


< 




r . 




o 


o 




o 








< 


O 


o 


o 


H 


a 




o 


O 


o 




CJ 




o 


U 


o 




E-; 






< 








O 






r \ 


U 


r . 

C-* 










U 






rn 










a 


r V 






r \ 






rn 


C 


f 1 






U 




O 


O 






o 




u 


u 


o 


rf: 


o 


rf: 




o 






a 






a 




U 

















I 



wo 00/42195 



PCT/USOO/00956 



18/134 



[Sheets 18/134 to 43/134: figure content not legible in the scanned source.]






6121

MKQTLMAISI
MSLFSFNALA
AQrlriHUnl i V
nvpr'TTA ATEH
TIAHNQAVAK
TLNFADTRAF
EQSSKNLVAK
AEFAFISDEI
PDSVNPSLYR
V T(r\/ Q nr; T YO V
X IS. V OJJnJJ. i V
RGTDLSNLTL
IRSDNGWIAY
DVLLTKEAAK
PKDGDPWAM
IYSHSHADHF
GGARGVQEMF
PDVKVxvaoJJJN
ITKEIVDENV
LAGNAMSRRA
AYQYGATLGK
HDHG 1 V UAAIj
GKGLSKGEIT
YVAPDYTLNS
EGKWETLTID
GLEMVFMDAb
GTEAESEMIT
YIPSKKALWT
AELTYQGMHN
IYTLRGAKVR
DALKWSKDIN
EMINAFGQDV
EVLFASHSAP
VwGNQAINDr
LRLQRDNYGL
VHNQTLRLAN
DGVGIQDIGD
AIQD J. 1 IrJio 1
YKTWHTNGYH
GTYSHNAKAV
YNKYLGxrU
TKQESAKFVE
YMGGADAAIK
RAKDDYAQGE
YRFVATALNK
VVMAEPENDS
ARQLLADTYE
QLGYQAEGAG
WRNIYLTGAQ
ELRVGIQAGA
PKTASADVIS
EMDMPTLFDF
LAVKIDSQQA
AKHGLVKMNV
ITPDTKDILY
IELSNGNLSN
AWDKEQAAD
ANLMVNKADV
NRILLGQVTL
KALLASGDAK
LTGDKTAFSK
IADSMVEFTP
DFEIVPTPVK

8103







FIG. 4B 




8186 

STKASARWA KFNVEEAAIS IQQCQGISLA FRYSDDLHGL 

LCHWNDAANM QQEKAEILGL GSKQPEANPK NSSSELLALG 

IDQKLLVQRQ NLQHEVKHDA lADSIDVCHS LSKPANVGLF 

TESLASFDFA FSKLSLALGL GKAKIYSEKL AWLDFFRDRQ 

LAEPLALLAR KESESFYHSL ISHINTSNRC REIDVGFEIS 

ASDTEEKSAQ SAGKNDATCI GVLLWDGSHS VNFHVGTQAF 

QADSLRPKGK DGYEFRWENP RIESHQSLLA RLYGRVM 
aCTAGTCTTA GCTGASRTHR YSAASRAGCT CGAACAACAG CTTTAAAATT
CACTTCTTCT GCTGCAATAC TTATTTGCTG ACACTGACCA ATACTCAGTG 
CAAAACGATA ACTATCATCA AGATGGAAAR GVAVAAAYSH ASNVAGGAAA 
ASRGNGNCYS GNGYSRAAHA RGTYRSRASA SHSCCCAGTA AACAATGCCA 
ATTATCAGCA GCGTTCATTT GCTGTTCTTT AGCCTCAATC AAACCTAAAC 
CAGACTTTTG TGGCTCAGCG TTAGGCTTAT TAGGYCYSHS TRASNASAAA 
AASNMTGNGN GYSAAGGYGY SRYSGNRGAA ASNRYSASNS RAACTCGACT 
CTAGTAAAGC AAGACCAATA TCTTGTTTTA ACAAAACCTG TCGCTGATTA 
AGTTGATGCT CAACCTTGTG ATCCGCAATA GCATCGGAAA TSRSRGAAGY 
ASGNYSVAGN ARGGNASNGN HSGVAYSHSA SAAAAASSRA TCAACACAAT 
GGCTCAAGCT TTTAGGTGCA TTAACTCCAA GAAAAGITTC GCTCAGTGCA 
GAGAAGTCAA ACGCAAAAGA TTTTAGCGAT AATGCCAGCA SVACYSHSSR 
SRYSRAAASN VAGYHTHRGS RAASRHASHA AHSRYSSRAA CCAAGTCCTT 
TCGCTTTAAT GTAAGACTCC TTGAGCGCCC ACAAATCAAA AAAGCGGTCT 
CGCTGCAAGG CCTCTGGTAA CGCTAACAAG GCTCGCTTTT GYGYYSAAYS 
TYRSRGYSAA TRASHHARGA SARGGNAAGR AAAAARGYSG CTGATTCAGA 
GAAATAATGA CTAAGAATAG AGTGGATATT GGTGCTGTTA CGGCAACGCT 
CAATGTCGAC GCCAAACTCA ATACTAGCAG AGTCAGTTTC SRGSRHTYRH 
SSRSRHSASN THRSRASNAR GCYSARGGAS VAGYHGSRAA SRASTHRGCT 
CCTTGCTTGC CTGACTGGCG CCTTTATTAT CAGCAGTGCA AATGCCTACT 
AATAGCCAAT CTCCACTATG ACTCACATTA AAGTGGACCC CGGTTTGAGY 
SSRAAGNSRA AGYYSASNAS AATHRCYSGY VATRASGYSR HSSRVAASNH 
HSVAGYTHRG NGCAAATTGC GCATCACTCA ATCTAGGCTT ACCTTTGTCG 
CCATATTCAA AGCGCCATTC ATTGGGGCGT ATTTCACTAT GTTGTGACAA

TAAAGCGCGC AAAHGNAAAS SRARGRYSGY YSASGYTYRG HARGTRGASN 

RARGGSRHSG NSRAAARGAA TAGCCTCTTA CCATTAAACC TTGAGTTTTA 

GCTTCTTGTT TAATGTAGCG ATTAACCTTA ATTAACTCAT CTTCAGGCAG 

CCATGACTTA ACCAACTCTY RGYARGVAMT GYGNTHRYSA AGGNYSTYRA 

RGASNVAYSG ASGRTRSRYS VAGTGTAGTC TGGTTATCGC ACTCTTGTAT 

TGTTAACGGA CAGAAGTATA AGGAAATCAA 
ASMFLNSKLS RSVKLAISAG LTASLAMPVF AEETAAEEQI ERVAVTGSRI

AKAELTQPAP WSLSAEELT KFGNQDLGSV LAELPAIGAT NTIIGNNNSN 

SSAGVSSADL RRLGANRTLV LVNGKRYVAG QPGSAEVDLS TIPTSMISRV 

EIVTGGASAI YGSDAVSGVI NVILKEDFEG FEFNARTSGS TESVGTQEHS 

FDILGGANVA DGRGNVTFYA GYERTKEVMA TDIRQFDAWG TIKNEADGGE 

DDGIPDRLRV PRVYSEMINA TGVINAFGGG IGRSTFDSNG NPIAQQERDG 

TNSFAFGSFP NGCDTCFNTE AYENYIPGVE RINVGSSFNF DFTDNIQFYT 

DFRYVKSDIQ QQFQPSFRFG NININVEDNA FLNDDLRQQM LDAGQTNASF 

AKFFDELGNR SAENKRELFR YVGGFKGGFD ISETIFDYDL YYVYGETNNR 

RKTLNDLIPD NFVAAVDSVI DPDTGLAACR SQVASAQGDD YTDPASVNGS 

DCVAYNPFGM GOASAEARDW VSADVTREDK ITQQVIGGTL GTDSEELFEL 

QGGAIAMWG FEYREETSGS TTDEFTKAGF LTSAATPDSY GEYDVTEYE^ 

EVNIPVLKEL PFAHELSFDG AYRNADYSHA GKTEAWKAGM FYSPLEQLAL 

RGTVGEAVRA PNIAEAFSPR SPGFGRVSDP CDADNINDDP DRVSNCAALG 

IPPGFQANDN VSVDTLSGGN PDLKPETSTS FTGGLVWTPT FADNLSFTVD 

YYDIQIEDAI LSVATQTVAD NCVDSTGGPD TDFCSQVDRN PTTYDIELVR 

SGYLNAAALN TKGIEFQAAY SLDLESFNAP GELRFNLLGN QLLELERLEF 

QNRPDEINDE KGEVGDPELQ FRLGIDYRLD DLSVSWNTRY IDSWTYDVS 

ENGGSPEDLY PGHIGSMTTH DLSATYYINE NFMINGGVRN LFDALPPGYT 

NDALYDLVGR RAFLGIKVMM 
^SQTSmNS ATEQAQDSQA DSRLMKRLKD MPIAIVGf^^ IFMSRYLHK
,„DLISEKID AITELPSTHW QPEEYYDADK TAADKSYCKR GGFLPDVDFN 
PHEFGLPPNI LEL^SSQLL SLIVAKEVLA DAM.PE«YDR DKIGITtGVG 
GGQKISHSLT ARLQYPVLKK VFANSGISDT DSEMLIKKFQ DQYVHWEENS 
FPGSLGNVIA GRIANRFDFG GMNCWDAAC AGSI^AMRMA LTELTEGRSE 
MMXTGGVCTD KSPSMYMSFS KTPAFTTNET lOPFDIDSKG MMIGEGIGMV 
ALKRLEDAER DGDRIYSVIK GVGASSDGKF KSIYAPRPSG QAKAI^YD 
DAGFAPirrLG LIEAHGTGTA AGDAAEFAGL CSVFAEGNDT KQHIALGSVK 
SQIGHTKSTA GTAGLIKAAL AUmKVLPPT INVSQPSPKL DIENSPFVL« 
TETRPWLPRV DGTPRRAGIS SFGFGGTNFH FVLEEYNCEH SRTDSEKAKY 
RQROVAQSFL VSASDKASLI NELNVLAASA SQAEFItKDA AANYGVRELD 
KKAPRIGLVA NTAEELAGLI KQALAKLAAS DPHAWQLPGG TSYRAAAVEG 
KVAALFAGQO SQYUmGRDL TCVYPEMROO FVTADKVFAA NDKTPLSQTL 
VPKPVFNKDE LKAQEAILTN TANAQSAIGA ISMGQYDLFT AAGFNADMVA 
GHSFGELSAI. CAAGVISADD YYKIAFARGE AMATKAPAKD GVEADAGAMF 
AIITKSAADL ETVEATIAKF DGVKVANYNA PTQSVIAGPT ATTADAAKAL 
TELGYKAINL PVSGAFHTEL VGHAQAPFAK AIDAAKFTKT SRALYSHATG 
GUYESTAAKI KASFKKHMLO SVRFTSOLEA MYWIGARVFV EFGPKNILQK 
:.VQGTLVNTE NEVCTISfflP «PKVDSDLQL KQAA«QLAVT GWLSEIDPV 
QADIAAPAKK SPMSISLHAA NHISKATRAK MAKSLETGIV TSQIEHVIEE 
KIVEVEKLVE VEKIVEKWE VEKWEVEAP VNSVQAMAIQ TRSWAPVIE 
HQWSKHSKP AVQSISGDAL SNFFAAQQQT AQLHQQFLAI PQQYGETFTT 
LMTEQAKIJ^ SGVAIPESLO RSMEQFHQLQ AQTLQSHTQF LEMQAGSNIA 
AI^LLNSSQA TYAPAIHNEA IQSQWQSQT AVQPVISTQV NHVSEQPTQA 
PAPKAQPAPV TTAVQTAPAQ VWQAAPVQA AIEPIHTSVA TTTPSAFSAE 
TALSATKVQA TMLEWAEKT GYPTEMLELE MDMEADLGID SIKRVEILGT
V<JDELPGLPE LSPEDLAECR TLGEIVDYMG SKLPAEGSMN SQLSTGSAAA 
TPAANGLSAE KVQATMMSW AEKTGYPTEM LELEMDMEAD LGIDSIKRVE 
ILGTVQDELP GLPELSPEDL AECRTLGEIV DYMNSKLADG SKLPAEGSMN 
SQLSTSAAAA TPAANGLSAE KVQATMMSW AEKTGYPTEM LELEMDMEAD 
LGIDSIKRVE ILGTVQDELP GLPELNPEDL AECRTLGEIV TYMNSKLADG 
SKLPAEGSMH YQLSTSTAAA TPVANGLSAE KVQATMMSW ADKTGYPTEM 
LELEMDMEAD LGIDSIKRVE ILGTVQDELP GLPELNPEDL AECRTLGEIV 
DYMGSKLPAE GSANTSAAAS LNVSAVAAPQ AAATPVSNGL SAEKVQSTMM 
SWAEKTGYP TEMLELGMDM EADLGIDSIK RVEILGTVQD ELPGLPELNP 
EDLAECRTLG EIVDYMNSKL ADGSKLPAEG SANTSATAAT PAVNGLSADK 
VQATMMSWA EKTGYPTEML ELGMDMEADL GIDSIKRVEI LGTVQDELPG 
LPELNPEDLA ECRTLGEIVS YMNSQLADGS KLSTSAAEGS ADTSAANAAK 
PAAISAEPSV ELPPHSEVAL KKLNAANKLE NCFAADASW INDDGHNAGV 
LAEKLIKQGL KVAWRLPKG QPQSPLSSDV ASFELASSQE SELEASITAV 
lAQIETQVGA IGGFIHLQPE ANTEEQTAVN LDAQSFTHVS NAFLWAKLLQ 
PKLVAGADAR RCEVTVSRID GGFGYLNTDA LKDAELNQAA LAGLTKTLSH 
EWPQVFCRAL DIATDVDATH LADAITSELF DSQAQLPEVG LSLIDGKVNR 
VTLVAAEAAD KTAKAELNST DKILVTGGAK GVTFECALAL ASRSQSHFIL 
AGRSELQALP SWAEGKQTSE LKSAAIAHII STGQKPTPKQ VEAAVWPVQS 
SIEINAALAA FNKVGASAEY VSMDVTDSAA ITAALNGRSN EITGLIHGAG 
VLADKHIQDK TLAELAKVYG TKVNGLKALL AALEPSKIKL LAMFSSAAGF 
YGNIGQSDYA MSNDILNKAA LQFTARNPQA KVMSFNWGPW DGGMVNPALK 

FIG. 4G-2 
MSLPDNASNH
FIG. 4I-1






EEIKARSLVQ 
MNPTATNEML SPWPWAVTES
YWNHADHGF GIAQTADIVT 
ESLGDNNFRR VHGVKYAYYA 
ILCGSFGAAG LIPSRVEAAI 
SEPALERGSV ELFLKHKVRT 
LSRDAQGKW VGNKVIAKVS 
VDDGSITAEQ MELAQLVPMA 
LLPTILALKE EIQAKYQYDT 
NMGAAYIVTG SINQACVEAG 
PAADMFEMGV KLQWKRGTL 
IPLDEREKLE KQVFRSSLDE 
EGNPKRKMAL IFRWYLGLSS 
ALGAFNQWAK GSYLDNYQDR 
SLTAQGVKVP AQLLRWKPNQ 



NISFDVQVME QQLKDFSRAC 
EQAANSTDLP VSAFTPALGT 
GAMANGISSE ELVIALGQAG 
NRIQAALPNG PYMFNLIHSP 
VEASAFLGLT PQIVYYRAAG 
RTEVAEKFMM PAPAKMLQKL 
DDITAEADSG GHTDNRPLVT 
PIRVGCGGGV GTPDAALATF 
ASDHTRKLLA TTEMADVTMA 
FPMRANKLYE lYTRYDSIEA 
IWAGTVAHFN ERDPKQIERA 
RWSNSGEVGR EMDYQIWAGP 
IJAVDLAKHLM YGAAYLNRIN 
RMA 
32358 
32834

MRKPLQTINY DYAVWDRTYS 

DTFKSLKVDG VFIFNRTNQP 

FKQHPQNIAL SPQTKQAHPP 

RYGPAIYYSS TSILKSDRSG 

SQYTAAGVEI AMADAADAQL 

TNVDGKPLLK LVLYHTNNQP 

LAYFLYSYFL VRPVRKLASD 

ELVKVATHFN ALMGTIQEQT 

FEQRLETYCQ LLARQQIGFT 

GDEALIKVAQ TLSQQFYRAE 

EPLQRKLDAM LHSFAELNLP 

AVDDFEFKSE SHIIGSQAAL 

TTITVDEIEQ LEANKIGHQ 

34327 



YMKSNSASAK RYYEKHEYPD 
VFSKGraHRN DIPLVFELTD 
ASKPLDSPDD VPSTHGVIAT 
SQLGYLVFIR LIDEWFIAEL 
ARLGANTKLN KVTATSERLI 
PPMLDYSIII LLVEMSFLLI 
IKKMDKSREI KKLRYHYPIT 
KQLNEQVFID KLTNIPNRRA 
LIIADVDHFK EYNDTLGHLA 
DICARFGGEE FIMLFRDIPD 
HPNSSTANYV TVSLGVCTW 
lADKALYHAK ACGRNQALSK 
Ltagatcgactcgcaaaagttgcttaagatagtgtcaatatagcttcttatttgta

AATATTGTTTTTTATGTGTAAACATGTTTAGTGTGTGTAAATGCTGTTAATTATCCT 

TTTGGGATTGTAATAGCTGATGTTGCTGGCTAATGAGTACTTTTAGTTCGGCAATAT 

CTTGCTTTAAATCGCTAACTTCAGTTTTTAATTCACCCACACTTGTTGTATTTTTAA 

GGCTCTCTTCCCCACCATCGACAAACCAGGATGATATGAAACCGGTAAACGTACCAA 

AGAGACCGACACCTGCAGTCATGAGTAATGCCGCAATGATACGTCCGCCAGTGGTGA 

CGGGGTAGTAGTCACCGTAACCAACAGTCGTTATTGTCACAAATGACCACCAAAGTG 

CGTCGATGCCGTTATTGATGTTACTGCCTACTTGATCCTGTTCTAACAATAAAATAC 

CGATAGCACCAAAGGTGACAAGGATGAAGGATATCGCAGATACCAGCGAAAAGGTGG 

CTTTAAACCGATGTTCAAAAATCATTTTTAAGATAATTTTTGATGAGCGTATATTCT 

GAATAGATCTTAATACTCTAGCGATACGAATTATGCGAATAAACTGCAGTTGCTCGA 

CCATCGGAATACTCGACAGTAGGTCAATCCAACCCCArrTCATAAACTGAAATTTAT 

TCTCAGCTTGGTGAAAGCGAATTACAAAGTCAGTGAAAAAGAATAAGCAAATCGTAT 

TATCTACGCTCGTTAATATTTCAGTGACGTtACTTGAAAAGGTAAAAATAAGTTGCA 

GTAGTGATGATACGACCACATGAAGTGATAAAATAAGCATGAAAATCTGAAATGGAT 

TTACATCACTGTTGTTTTTGGTGCCACTTTTAAGGTTCGTTTTCACAATCTGCTGCC 

TCGGTTCATTGATTTTGTTAATATAAACCTTAGTCAGTAGCAAGACAAAATATATTT 

ACATCAATGTCATCGTATTATTCAACCGCGCGTCGTGTATTCAGACCAAGATCGTTG 

TATATGTTAGTCATGTAGCGATGAGATTATCATGCGACAGGAGAGAATTATGTTTGT 

TATTATTTITTACGTACCTAAAGTTAATGTTGAAGAAGTAAAACAGGCGTTAm 

CGTCGGAGCTGGCACCATCGGTGATTATGATAGTTGTGCTTGGCAATGTITGGGGAC 

TGGGCAGTTCCAACCTTTACTTGGTAGGCAGCCACATATTGGTAAGCTAAATGAGGT 

TGAATTCGTTGATGAGTTTAGAGTAGAAATGGTTTGTCGAGCAGAAAATGTAAGGGC 

AGCAATAAATGCACTTATTGCTGCGCACCCTTATGAAGAACCTGCTTATCATATTCT 

GCAAACATTGAATCTTGATGAGTTACCTTAAGTTAGATGCACTGCACTTAATTGGTT 

CGCTGTGCTAGGTTAGCAATTAGCAATTTTGACCATGTTAGCGATAGTTTTGGCACA 
AGTGATCGATATTAAACTATCCGATTCAGATCCCATTTTTACTGCTGAATTAGGTTT
CATTACACTTGTTCTAGTGGTTTTTCCCGACAGGTGTAACTCTGTTACTTGCGTAAG 
GTTGATAATCTCTACCGCATTGGCAGGAGTTACACCTGCACCAGGCATAATACTAAT 
TCTACCATCTGCTTGGTTAACTAACGTTTGGATTAAGGCGCAGCCTTCTAGCGCTTG 
AGCTTGTTGACCAGAGGTTAAAATACGCTCACAACCAGCAGTGATCAAGGTCTCCAA 
GGCTTGTTGTGGATCATTACACAAGTCGAAAGCGCGGTGGAAGGTTACGCCGAGATC 
ACGTGATGCCACCATTAAGCGTTTTAAAGCTGGCTCGTCAATATTACCATCTGCTGT 
TAACGCGCCAATAACGACCCCTTGGACACCGAGTAACTTCATGAATTTGATGTCGGA 
AACCATAATATCAACTTCTTGTTCGCTATATACAAAATCACCGGCGCGAGGGCGAAT 
AATGGCATAAATGGGGATCGTTGCTAGATCAATAGACTTTTGTACAAAACCTGCGTT 
GGCGGTCAAGCCACCTAATGCTAATGCCGAGCACAACTCAATACGATCGGCGCCAGA 
TGCTTGAGCCGTCAGCAGTGATTCTATATTATCGACACATACTTCTATTGTCATTGT 
CATATACTTCTCTTTAAAAAGTTTATTAAAAATAATAAAGCCAGCATAAGTCGTTTT 
ATACAATATGAAAGGGGAAAAGGCGACTTAGCTCGCCTAGATCAATTATTATGGCAG 
AATACTGCCGTATTGTGATTAGAAAGACAGTTTTTTAAGCTCAATAGCCGTTATCGC 
GTTGTTATCTACCATCGTGTAACTTTTCTGGCCTGGGTGCTTTATTAACACTGTTTC 
AGTGGCTGGATTAGGGTGAAATGATTCTTTTTTCAAATCTGTTTTTTTGTATTTGAA 
CGTACCTGTAATGTCTTGCTGCTCACGAAGACGTACAAATATTGGTTGCGCATAGCT 
TGGTAGTGCCGCATTGACATGTTGATAGAATTCAGACGCTGAAAATTCATGAATAGG 
GCAATTCAAAGTCAGCGCGACCATGCCTGCTCGGCCATCGTGATGTGGGAGCTTGAC 
ACCATAAGCCACACTTTGCTCAATTTGCACAAAATCGTTAACTTGAGCTTCTACTTG 
CGTCGTGGCGACATTTTCACCTTTCCAGCGGAATGTATCACCTAATCTATCCACAAA 
GGAAATATGGCGATAACCTTGGTAATGAACGAGATCGCCGGTATTAAAATAACAGTC 
ACCGTCTTTTAATACTGACTTAAATAGCTTTTTATTACTTTCGTTGTCATCGGTATA 
ACCATCAAATGGTGAACGTTTAGTTATCTTTGTTAGCAGTAGCCCTGTTTCTCCCGT 
TTTTACTTTGGTCATTTTCCCTTTCGCATTATACACAGGTTTGTCATTGTCAATATC
ATATTGTATGACGGTAAAAGCAAGTGGAGTAACCCCCGCTGTATGCGGTAAGTTCAG 
CGCATTGGAGAACACAAGATTACACTCACTGGCGCCATAGAATTCATTAATATGCTC 
GATCCCAAAACGTTGTTGGAAATGATCCCAAATTTCGGGGCGTAATCCATTACCTAT 
GATTTTCTTTATATTATGCTGTTTGTCTTTATTGCTAGGCGGTACATTTAATAAATA 
ACGGCAGAGCTCGCCGATGTAAGTAAACGCAGTGGCATTATGAGCACGAACTTCATC 
CCAAAAGCGACTTGAACTGAATTTTTCAGAAAGTGCGAGGGTTGCTGCGCTACCAAA 
CACGGCGCTTAATGACACTGTCAGTGCATTGTTATGGTATAGGGGGAGTGATAAATA 
CAATACATCATCAGCTGTTAAGCGTAATGATGCCATCCCCATGCCTGCCATGGATTT 
AAACCAACGGTGATGGCTCATTCTTGCTGCTTTTGGCAGTCCAGTTTTTCCCGAGGT 
AAAGATATAAAACGCGCAATGCTTAAGCTGTATTTGTGCTGTTGATTCAGGGTTCAA 
TACTGAATATCCTGCGACTAGTGTAGATATGTTTTTATAACCATCACTCATGTCTGG 
CGTTTCTAAAGCGGGTACGTAAAAGACATTCTGTTGTAATGTCGATGACAAATTGGT 
TTCAATATTATTAATGGCGGATGTGTATAGTTCATCTGCGATGAGTAATTTGGTATC 
GACCACGCTAAGACTATGTTCGAGGATTGAATCCCGTTGTGTCGTATTTATCATACA 
AGCAATCGCGCCAAGCTTGACAACTGCGAGGGCAATAATGATGGTTTCAGGCCTGTT 
ATCGAGCATGATGGCGACTTTATCATTTTTACCAATGCCGTATTCATGAAGGAAATG 
GGCATATTGATTTGCTTGCTTATTCAATGAATCGTAACTATAACGCTGGTCTTTAAA 
TTGTATTGCGATCAAGTCAGAGTTATTGACAGCTTGCTGCTCTAGTAATAAACCAAT 
AGACATAAAACGTTCGGGCTTTGCTTGTTGTAAGTGCCATAAGCCTTTGATGATTGG 
CTTTGGGGTTTTTAATAGATTGATGGTACTTTTCAGGAATTGTTTGCCGGTTATAAC 
AGTCATAAGCTAATTCTTTTTATCAAGAAGAGGGGTTATGACACCAAATAAATGGGT 
CACGCGTTGGTTTAATTTGGTTAGACTAAATGTGTTGTTTTGCTGTGATAATGCGAC 
GTTCAAACAAACTTGAGAAGGTAAAAAAATAGCATTTTTAAATTGAACATCAATACT 
AATGTGTTGAATATCAATCAAGTTTTCTAACTGTGCGAGCACGCGTGCTTTAGCAAA 
CATGCCATGTGCTATTGCTGTTTTAAACCCCATTAGTTTCGCTGGGATAAAATGTAA
ATGGATTGGATTTGTGTCTTTGGAGATATAAGCATATTTATATACGTCAAAAGGACT 
AAATTTAAACAATGAAATCGGCTCGTAAGCATAATTCGCTGGCGTATTTACTATTTT 
CTCACCGCTGGAACGTTGAGATCGTTGGCACGTTTTTCGCTGTTTCGTTTTCTGTAA 
GAATGTCGATGTACACTCCCACGCAAATTGTCCATCTACAAACACATCAATATGAGT 
ATCAATGAAACGTCCTGTATCCGTTATGTACTCCTTAATTACACGACATGTGCTCGT 
CAATATCGCGTTTAATGCTATCGGTTGATGTTGTGTTATGCGATTTCGATAATGGAC 
TAGTCCTAATATAGATATCGGAAATTGTGTTGATGTCATGAGTTTCATCAATAATGG 
AAAGATCATCACAAATGGATAAGTAACCGGTACATAGTTTGTGTTATTAAACCCACA 
GCATTTAATATATTGCTTTAAATTTCGCTGATCTATTTTTTGTCCACTGATACTAAA 
TTGCTCAGTACACACTTGTGTCGACCAAGTGTTCATCAGTGTTTTAACAATTGTATT 
GAeCACTGCTTTCACATATAAAAGCGAGATAATCGGTTGCTTTGTTAACAGTGTGAT 
CTGGTTAGCGTGCATTGAAATAATTCATATAAGAGTATGTAGCATTTATGTTAATAT 
TTTGTTTTGGAAGTTGAATTGGCGAATCCGTAATCGGTTTATGGCAGTTCGGTCAAA 
TACTTCAGGTAAACTCGTTACTCATACCATTGATAGTGTTAAAGTGATTGACTGAAT 
AAAGAATAGAGCTAAAAGTGGAAAAATTATGCAAGATGCGGGTATGTTATTACGCAT 
TGCTTATGAGGCAATGAAAGAGTTAGAGGTTGATGTCATTGAAGTACTTTCTCGTTG 
TAACATAAGTGAAGAAGTACTGAATGATAAGGATCTTCGCACACCTAATCATGCACA 
AACACATTTTTGGCAAGTATTAGAAGACATATCACAAGATCCTAACATCGGCATTTe 
ACTTGGTGAGAGAATGCCAGTGTTCACGGGGCAGGTATTACAGTATCTTTTTCTCAG 
TAGTCCTACATTTGGTACTGGCTGGGAACGCGCAACAAAATACTTTCGATTAATCAG 
TGATGCGGCGAGTGTTTCTATCAAGATGGAAGGCTGTGAAGCGCGATTATCTGTGAA 
CTTAGATGGTTTAGCGGAAGATGCGAATCGTCATTTGAATGATTGCCTAGTGATCGG 
TGCATTTAAATTTTGTTTATATGTGACAGAAGGCGAATTTAAAGTAAGCAAAATAGC 
CTTTGCTCATGCTCGCCCGAAAGATATTACTGCCTATACCAATGTATTTACATGTCC 
GATTGAGTTTGCTGCCGAAGATAATTATATTTATTTCGATGCTGATTTACTCGAACG
TCCTTCTTCGCATGCGGAGCCTGAGCTATTCGCCTTACACGATCAGCTTGCAAGCCG 
TAAAATAGCCAAGTTAGAACTGCAAGATTTAGTGGATAAAGTACGTAAGGTTATTGC 
ACAACAACTTGAGTCTGGTGTGGTGACTTTAGAAAGTATCGCCACTGAACTTGACAT 
GAAACCACGTATGCTAAGAGCGAAGTTAGCTGACATTGATTATAACTTTAATCAAAT 
ACTCGCTGATTTTCGTTGCGAGTTATCAAAAAAACTGTTGGCGAATACGGACGAGTC 
TATTGATCAGATTGTCTATCTCACTGGTTTTTCTGAACCAAGTACTTTTTATCGTGC 
CTTTAAGCGCTGGGTTAAAATGACGCCAATTGAATATCGCCGTAGCAAACTCGCGGT 
TAGGCATGCTAATCAACACGAGTCCTAAAAATTCGCTGCTTAGTGCATAGTGCATAG 
TGCATAGTGCTAGTAAGCCAAGTACAAAGCGTTAAAGTTAAGTACTTGAGCGAACCA 
TCAGACACCACTTACTAGATTAAGCACCTATTAATGATTGACCACAAATTCTGATCG 
TATTGCCTGTGATCCCTGCAGCTTGAGGTTGCGCAAAAAAAGCTATCGCTTCAGCAA 
CATCAACTGGCTTACCACCTTGTTTTAATGAATTCATACGACGACCAGCTTCACGAA 
CTGTAAATGGAATCGCTGCTGTCATTTTTGTTTCAATAAAGCCTGGTGCAACAGCAT 
TAATGGTGATGTATTTGTCTGCAAGCGGAGTTTGCATTGCATCAACATAACCAATGA 
CTGCGGCCTTAGACGTTGCATAATTAGTCTGACCAAAGTTACCCGCAATCCCACTCA 
TCGAAGACACACAAACAATGCGGCCATAGTCGTTGAGCAGATCATCATTTAGCAGTC 
GCTCATTGATTCTTTCCATTGCCGACAAGTTAATATCCATCAGTACATCCCAATGGT 
TATCCGGCATACGTGCTAGCGTTTTGTCTTTTGTTACCCCGGCATTATGGAGGATGA 
TATCAAGCGACTGTTCTCGCACAAAGTCAGCAATGATATTTGGGGCGTCAGCAGCGG 
TAATATCAGCAACAATGCTGCTACCTTTCAAGCAATGAGCTACTTTTTCAAGGTCCT 
GTTTTAATGCCGGAATGTCTAAGCAAATAACATGTGCGCCATCACGGGCGAGTGTTT 
CAGCAATAGCAGCCCCGATGCCACGTGATGCACCAGTGACAAGTGCTGTCTTTCCTT 
GTAATGGTTTTGCCGTGTTACTTGTTTCGTTAATAACTTCGTTAATAACTTCGTTAA 
TAACTTCCTTAATAGCCCCATTAATCGAACCGGGTTTTACGTTAATAACCTGTGCTG
AGATATAGGCTGATTTTGCTGAGGTTAAGAAACGTAGCGGGGCCTCTAATAATTGCT 
CACTACCAGGTTGTACATAGATAAGTTOACAGGTACTACCATTCTTGCCTATTTCTT 
TGGCGACACTGCGACAAAACCCTTCTAAAGATCTTTGTACAGTCGCGTAGCTTACAT 
CGTCAAGATOTTCACTCGGATGACCTAACACGATCACTCTGCTGCATGGCGAGAGCT 
OCITAATTACAGGTrGAAAAAAACGATGTAATGCACTTAATTGCTTGCTGTTCTTAA 
TGCCTGAGGCGTCGAAGATAATACCGTTGAAGCGATCTGTTTTAGCGATAGCATTAA 
SGCTAATAGGTGTCGCGACTAAAGACGTTTGATTAAATTCAATATTAAGATCGGCTA 
ACGCTGACGTGTTATTAGGATAAGAAATCGTGACTTCAGCATCTTTAAATGTGTTAA 
OAATGGGTTTAATTAATTTGCTGTTGCTGGCTGCGCCGATGAGTAAGTTGCCAGAGA 
-AGATCGGTTCCCTGATCGTAGCGTGTTAACGTAACCGGTCGTGGCAGATTAAGCG 



CTTi^TAAACCTGATGTCCACTTGCCATTAGCGAGTTTTGCGTATGTATCCGTCA 
^ \TGTGTTAAA 



•fTTrCTAATCCrTGTTATAGTGAACAGTTrOAATCTCGAAGATGTACAT 
AATTATCTGATAGCTATGACTTATCTGCCACTACGTAATAATAAATAGACCAGTTCA 
TTACATCGTTAATCGATATAGTATAACTAAATACTAAGTAAATTATAATGATAAGAC 
TGTTATCGTACTCGGATCAAACTCTGATCAGCAAATAATCAAATTAGAGTTTTTATT 
rrAAACTTGTATCAACAATGTTACATTAATGTATCTTACGTCTAATGTGCTACGGGC 
^.TATTTAAGTCACTAAATTAAAGGAATAAACCATGACAGGTCAAACAATAAGAAGAG 
TAGCAATTATCGGCGGTAACCGTATCCCGTTTGCACGTTCAAATACAGCGTATTCAA 
AACTAAGTAACCAAGATATGCTGACGGAAACTATCCGTGGCITGGTGGTTAAATATA 
ACCTACGTGGTGAACAACTGGGGGAAGTTGTTGCTGGTGCGGTAATTAAGCATTCTC 
GTGATTTrAACTTAACACGTGAAGCCGTGCTAAGTGCAGGTCTTGCACCTGAAACGC 
CTTGTTATGACATTCAACAAGCTTGTGGTACTGGTCTAGCrGCAGCrATCCAAGTAG 
CAAACAAAATTGCGCnGGTCAAATAGAAGCGGGTATrGCTGGTGGTTCTGATACGA 
CATCAGATGCACCGATTGCAGTCAGTGAAGGCATGCGTAGTGTATTACTTGAGCTTA

ATCGAGCTAAAACGGGTAAGCAACGTTTGAAAGCACTATCTCGTCTACGTCTAAAAC 

ACTTTGCGCCACTAACGCCTGCAAATAAAGAGCCGCGTACCAAAATGGCGATGGGCG 

ATCATTGTCAAGTAACAGCGAAAGAGTGGAATATCTCACGTGAAGCACMGATGCAT 

TGGCCTGCGCAAGTCATCAAAAATTAGCTGCAGCATATGAAGAAGGTTTCTTTGATA 

CGTTAGTTTCACCTATGGCCGGCTTAACGAAAGATAACGTATTACGCGCAGATACAA 

CAGTTGAGAAACTGGCTAAATTGAAACCTTGTTTTGATAAAGTAAACGGCACTATGA 

CGGCGGGTAACAGTACTAACCTTACCGATGGAGCATCAGCTGTATTACTTGCAAGTG 

AAGAATGGGCAGCGGCACATAACTTACCAGTACAAGCTTATCTAACATTTGGTGAAA 

CGGCCGCTATCGACTTCGTTGATAAGAAAGAAGGTCTGTTAATGGCGCCTGCATACG 

CAGTGCCAAAAATGTTGAAGCGTGCTGGCCTTACATTACAAGACTTCGATTACTATG 

AAATACATGAAGCATTTGCTGCGCAGTTATTAGCAACGCTAGCAGCTTGGGAAGACG 

AAAAATTCTGTAAAGAAAAACTGGGTCTAGATGCTGCGCTTGGTTCAATTGATATGA 

CCAAGTTAAACGTGAAAGGGAGTAGCTTAGCCACGGGTCACCCATTTGCCGCAACTG 

GTGGTCGTGTTGTCGCTACGCTAGCGCAATTACTTGATCAGAAAGGTTCAGGTCGTG 

GTTTGATCTCGATTTGTGCTGCTGGTGGTCAAGGTATCACGGCAATTTTAGAGAAAT 

AAACGCACTGTTTATTATCTATTGATTAAGCTGTCCTGAGATACTGGATATTTTTAA 

ATAAAACGCCAATACTGCAGAGTATTGGCGTTTTTTTGTAATACCAATTCCTATATA 

ACGGTGCATTTTAAACACTTAATTTCCGGCATTGGTATCATAAAAAAGCAGCACCGA 

AGTGCTGCTTGATTGTAGATTAACCTATTAAAATAGAGAGGCTAGAATTAGTCTTGG 

TATGCTTCATTATGTACGCCAGCTGCACGACCCGATGGATCAGCATTGTTTTGGAAA 

CTTTCATCCCAAGCTAATGCTTCTACAGTTGAACAAGCAACGGATTTACCAAACGGT 

ACGCATTTCGCTGCTGAATCACCTGGGAAGTGATCTTCAAAGATGGCACGATAGTAG 

TAACCTTCTTTCGTATCTGGTGTGTTAATTGGGAACTTAAATGCTGCACTTGCTAAC 

ATTTGATCAGTTACCGCTTCTTCAACGTGTACTTTAAGTTGGTCAATCCAAGAATAA 
CCAACACCATCAGAGAATTGTTCTTTTTGACGCCATACAATTTCTTCAGGTAGTAAA

TCTTCAAATGCTTCTCGAATGATGTTTTTCTCAATGCGGTCGCCCGTGATCATTTTT 

AGTTCAGGGTTTAGACGCATTGACGCATCAACAAATTCTTTATCTAAGAAAGGAACA 

CGTGCTTCGATGCCCCAAGCTGCCATAGATTTGTTTGCACGTAAGCAATCAAACATA 

TGTAATTTATTTACTTTACGTACCGTCTCTTCATGGAATTCTTTCGCATTTGGCGCT 

TTGTGGAAGTACAAGTAACCACCGAACAGTTCATCAGCACCTTCACCAGAAAGCACC 

ATCTTAATCCCCATGGCTTTAATTTTACGTGCCATTAGGTACATAGGGGTTGATGCA 

CGAATTGTTGTTACATCGTAGGTTTCAATGTGGTAAATCACGTCGCGTAAAGCGTCG 

ATACCTTCTTGCACAGTAAATTCAATTGAATGATGGATAGTACCTAAGTGATCTGCC 

ACTTTTTGTGCAGCGGCTAAATCTGGAGAACCATTTAGGCCTACAGAGAAAGAGTGT 

AGTTGTGGCCACCATGCTTCGGTTTTACCACCGTCTTCAATACGACGTTTTGCATAC 

TGTTGGGTGATTGCTGAAATAACAGATGAATCTAACCCGCCTGATAATAATACGCCG 

TAAGGTACATCACACATTAATTGACGTTTAACTGCATCrrTCCAAACCITGCTTAACA 

ACGCTTTTATCACCACCATTTTGTGCAACGTTATCAAAATCTTTCCAATCACGTTGA 

TAATAAGGCGTGACTACACCATCCTTACTCCACAGGTAATGACCTGCTGGGAATTCT 

TCAATTTGAGTACAAATTGGCACTAGTGCTTTCATTTCAGAGGCAACATAAAAGTTA 

CCGTGTTCATCATAGCCCGTATAAAGAGGGATGATACCGATATGGTCACGGCCAATC 

AGGTAAGCGTCCTCTGTTTCGTCATATAAAGCGAAAGCAAAAATACCATTTAGATCA 

TCTAAAAATTGTGTGCCTTTTTCTTTATATAGCGCAAGTATCACTTCGCAATCTGAT 

TCTGTTTGGAATTCAAAGTCTACGTTCAGCGTTTTCTTTAAATCTTTGTGGTTATAA 

ATTTCACCATTAACAGCAAGTACGTGTGTCITTTCTTCATTATATAGCGGCTGTGCA 

CCATTATTTACATCGACAATAGCAAGACGTTCATGAACTAAAATAGCATTGTCACTT 

GTATAGATACCTGACCAATCTGGGCCGCGGTGACGTAGTAACTTTGATAGTTCTAGT 

GCTTGTTCGCGAAGAGGTTTAATGTCTGATTTGATGTCTAGAATTCCGAATATTGAG 
CACATAACTAATTCCTTCTGGGGCTGCGTCTGCAGCTAACTTTCTAAATAGTGTGTC
TAATTTGCCACATTGTAGATTTAATGCAAACATTAATGATAAAACATTTATAAAAAA 
TGTAATTCAATGTGGAATCGATAATTTAATGGCTTAAAAGTGAAGATCCATTAATTG 
TGATGGCGAGGTGATAGACCAATGTAGACCTTAATGAATAAAGCAGGCACGATTGAA 
TCCATTCAACGCAAAGTGGTACTAACTATTGTTTTAAACGTTATAAATAGTGTTTTA 
AAGGTTATAAGTAAATAATTTAAAAACAATAATAATCCACATGCATTAAATTTATCA 
TGATAAACCGCTATATCTCAATGGCAATTTGGGATAAGTGTAAAATATATGTAAAAT 
GAATGAGTTGACTTGCTTTTTTTACACTAAGTGATGAAATTAAAGCTAGATGTCGTT 
GTTAGCATTGATTAATAACGTACTAAAATACGACATCTAGTATAGAAATTTAAAAAA 
CAGTTGGTTTTGATAGCATAACTGCATAAACTAATCAGCTTATTGTCTGTAATATTT 
TTGTAATTTAAATAGGTTTAATAAAATTATATGTCTGATAAATATAAACCGTACGAC 
CTTTCCTTTAAAAAGACGTTTTTGCTGCCTAAGTTTTGGCCTGTGTGGTTCGGGGTG 
TTTGCAATATACTTATTAGCTTTTATGCCAGTAAAGCCGCGTGATAAATTTGCTCGA 
TTCATAGCGAAGAAATTGTTTAGTCTAAAAATGATGGCAAAGCGTAAAAAGGTAGCA 
AAGATCAATTTATCTATGTGCTTCCCTGAAATGGATGATACGGAACAAGACCGTATA 
ATCATGGTCAATCTAGTTACTTTTTGTCAAACTATCTTAAGTTATGCAGAGCCAAGT 
GCGCGTAGTCGTGCTTATAACCGTGACCGTATGATAGTGCATGGTGGCGAGAATTTA 
TTTCCGCTACTTGAACAAGGTAAGGCTTGTATCTTATTAGTGCCGCATAGCTTCGCT 
ATTGATTTTGCAGGTTTACACATTGCTTCrTATGGCGCGCCATTTTGTACTATGTTT 
AACAATTCTGAGAATGAGTTGTTCGATTGGCTGATGACACGTCAACGCGCTATGTTT 
GGAGGCACTGTTTATCACCGCAAGGCAGGGCTAGGGGCTCTAGTTAAATCACTTAAG 
AGCGGTGAAAGCTGTTATTACTTACCTGATGAAGACCATGGACCTAAGCGTAGTGTA 
TTTGCGCCTTTATTTGCGACTCAAAAAGCAACTTTACCTGTAATGGGCAAGCTAGCA 
GAAAAAACAAATGCACTCGTTGTTCCTGTTTATGCGGCATATAATGAATCACTAGGT 
AAATTTGAAACCTTTATTCGACCAGCAATGCAAAACTTTCCATCAGAAAGCCCAGAA 
CAAGATGCAGTGATGATGAATAAAGAGATTGAAGCCTTGATTGAATGTGGTGTTGAT 
CAATATATGTGGACACTTAGATTATTGAGAACACGTCCGGACGGTAAAAAAATCTAC
TAATAAAGTTTAATAAACACCATAATCTTCGTTGAATATGGTGTTTACCCCCCTGAA 
TACCCTCTAAATTAATAACAAAAAAAGCCATTTACGTAACATCTAATGATGATTTAG 
CCTGCACTTGCTTTGTTTTTAGTCTTAAGAGCCTAATAAACTTGATCTAGGTATAGA 
TTCTGTCTTTCTTTACGTAACGCGATCTATTTTTTTTAACCGATAGTTGTTATAATT 
AGTTTCATATGAAAGAGATATCGTTTCAGTAAAAGCTATTTCGTTTCAATAGATAAT 
TTATTTATAGTCATATTTTCTGTAATGACAATCATTTTCTCATCTAGACTATAGATA 
AGAATACGAATTAAGTAAGAACATTAATTTTACAAGAATATAAAATATCCCATCGGA 
GCTATAAGAATGAAAAAGACTAAAATTGTTTGTACAATTGGTCCAAAAACTGAATCA 
GTAGAGAAACTAACAGAGCTTGTTAATGCAGGCATGAACGTTATGCGTTTAAATTTC 
TCTCATGGTAACTTTGCTGAACATTCAGTGCGTATTCAAAATATCCGTCAAGTAAGT 
GAAAACCTGAATAAGAAAATTGCTGTTTTACTGGATACTAAAGGTCCAGAAATCCGT 
ACGATTAAACTAGAAAACGGTGACGATGTAATGTTGACCGCTGGTCAGTCATTCACG 
TTTACAACAGACATTAACGTGGTAGGTAATAAAGACTGTGTTGCTGTAACATATGCT 
GGTTTTGCTAAAGACCTTAATCCTGGTGCAATCATCCTTGTTGATGATGGTTTAATT 
GAAATGGAAGTTGTTGCAACAACTGACACTGAAGTTAAATGTACAGTATTAAATACT 
GGTGCACTTGGTGAAAATAAAGGCGTTAACTTACCTAACATCAGTGTAGGTCTACCT 
GCATTGTCAGAAAAAGATAAAGCTGATTTAGCGTTTGGTTGTGAGCAAGAAGTTGAT 
TTTGTTGCTGCATCATTTATTCGTAAGGCTGATGATGTAAGAGAAATTCGTGAAATC 
CTATTTAATAATGGTGGCGAAAACATTCAGATTATCTCGAAAATTGAAAACCAAGAA 
GGTGTAGACAATTTCGATGAAATCTTAGCTGAATCAGACGGTATCATGGTTGCTCGT 
GGCGATCTCGGTGTTGAGATCCCAGTTGAAGAAGTGATCATGGCACAGAAGATGATG 
ATCAAAAAATGTAATAAAGCAGGTAAAGTTGTAATTACTGCAACACAAATGCTTGAT 
TCAATGATCAGTAACCCACGTCCAACACGTGCAGAAGCGGGCGATGTTGCCAATGCT 
GTGCTTGACGGTACCGACGCGGTAATGCTTTCTGGTGAAACTGCGAAAGGTAAATAC 
CCAGTTGAAGCTGTGTCTATCATGGCAAACATCTGTGAACGTACTGATAACTCAATG

TCTTCGGATTTAGGTGCGAACATTGTTGCTAAAAGCATGCGCATTACAGAAGCTGTG 

TGTAAAGGTGCGGTAGAAACAACAGAAAAATTGTGTGCTCCACTTATTGTTGTTGCA 

ACTCGTGGCGGTAAATCAGCAAAATCTGTTCGTAAATACTTCCCGAAAGCAAATATT 

CTTGCTATCACAACAAATGAAAAAGCAGCGCAACAGTTATGCCTAACTAAAGGCGTA 

AGCAGCTGCATCGTTGAGCAGATTGATAGCACTGATGAGTTCTACCGTAAAGGTAAA 

GAGCTTGCATTAGCAACTGGTTTAGCTAAAGAAGGCGATATCGTTGTTATGGTATCA 

GGTGCGTTAGTACCATCAGGTACAACGAATACGGCATCTGTTCACCAACTTTAAGTT 

GCCATATTGATATTATAAAAAAGAGAGCGTATGCTCTCTTTTTTTATATCTGTAGTT 

TATATGTCTGTACAAAAAAATGATAAAGAGTACATAAACTATTAATATAGCGTAATA 

TATAATGATTAACGGTGATGAAAGGGTTAAATAAATGGATAGTGCTAAACATAAAAT 

TGGCTTAGTCCTTTCTGGCGGTGGTGCGAAAGGTATTGCTCATCTTGGTGTATTAAA 

ATACCTGTTAGAGCAAGATATAAGACCGAATGTAATTGCGGGTACAAGTGCTGGCTC

TATGGTTGGTGCACTTTATTGCTCAGGACTTGAGATTGATGACATTTTACAATTCTT 

CATCGATGTAAAACCTTTTTCTTGGAAGTTTACCCGTGCCCGTGCTGGCTTTATAGA 

CCCGGCAAAATTATATCCTGAAGTGCTAAAATATATCCCCGAGGATAGCTTTGAGTA 

CCTTCAACCTGAATTGCGCATTGTTGCCACCAACATGTTACTCGGTAAAGAGCATAT 

ATTTAAAGATGGCTCCGTGATTAATGCCTTATTAGCATCAGCCAGCTACCCTTTAGT 

TTTTTCTCCGATGATCATTGACGATCAAGTGTATTCAGATGGCGGTATTGTTAATCA 

TTTCCCCGTGAGTGTCATTGAAGATGATTGCGATAAAATAATCGGCGTATACGTGTC 

GCCCATTCGTCAGGTCGAAGCTGACGAACTCTCGAGTATAAAAGACGTGGTATTACG

TGCGTTCACGCTGCAGGGTAGTGGTGCTGAATTAGATAAACTATCGCAATGTGATGT 

GCAAATTTATCCAGAAGCGCTATTGAATTACAATACGTTTGCAACCGATGAAAAATC 

ATTACGGGAGATCTACCAGATTGGTTATGATGCTGCAAAAGATCAACATGACAACCT 

TATGGCATTGAAAGAAAGTATCACCACCAGCGAGGTTAAAAAGAACGTCTTTAGCAA 
ATGGTTTGGTGATAAACTTGCTAGCAACAGCGGCAAATAGCGGCCCACACGGATTTA
TACACTAGGATAATGGGCGTTAATAGCCTCACTGTCGTTGTGTGGTCTCTAATTTTA 
GCTAAATCTTGTGTTATACTGACTTCCTATTAATCATAAACGATTTATCACGGTAAA 
CATGACTCAAATAAATAACCCGCTTCACGGCATGACACTCGAAAAAGTAATTAACAG 
TCTCGTTGAACAATATGGCTGGGATGGTCTTGGATACTACATCAACATTCGTTGCTT 
TACTGAAAATCCAAGTGTTAAGTCTAGTCTTAAATTTTTACGTAAAACCCCTTGGGC 
ACGTGATAAAGTAGAAGCGCTATATATCAAAATGGTGACTGAAGGCTAACTGTCTCC
ACGCTAGCGAACCGCTGTTTATAGTTAATATAAGTACTATAAGCAGGGCTCGTTAAT 
TCAGTATGTAATTAATCCTGAATACCTCCGCTTATTTCAACATTGTACTCTCTAGAT 
AACACTCTCAACATTACACCTTCAACATCACAGCCTCCACATAACATCCGATGACAT 
AGCCCTGTTATTTTTCACATTTATCTATATGCTATATATTTTAGCCATTTGATCAAT 
TGAGTTAATTTCTGCAATGACAAAGATATACCATCATCCAGTACAAATTTATTATGA 
AGATACCGACCATTCTGGTGTTGTTTACCACCCTAACTTTTTAAAATACTTTGAACG 
TGCACGTGAGCATGTGATAAATAGTGACTTACTAGCAACATTGTGGAATGAACGCGG 
TTTAGGTTTTGCGGTGTATAAAGCCAATATGACTTTTCAGGATGGGGTCGAATTTGC 
TGAAGTGTGTGATATTCGCACTTCTTTTGTCCTAGACGGTAAGTACAAAACGATCTG 
GCGCCAAGAAGTATGGCGTCCGAATGCGACTAGGGCTGCCGTTATCGGTGATATTGA 
AATGGTGTGCTTAGACAAACAAAAACGTTTACAGCCCATCCCTGATGATGTGTTAGC 
TGCAATGGTTAGTGAATAAATGGTTCATGCATAAATAGTTAATACATGATTCTGGCC 
CGTCACGTTTACAGATAAGAGGCATCCGATGCCTCCTTCCTATTACCAATACTACTG 
CTTATCCCTTTCTAACTATCTTTAGCGTCCATAACACACTGAGCATTTATTCTATTA 
ATCAGTGATTGTGATTTAATTATCTTCTATATATGTAATTTAATGTAATTTTCAATT 
TATTTTTAGCTACATTAAGGCTTACGAATGTACGCTAAAATGAGATGTCAGACTAAT 
TTTAGCTTATTAATCTGTTAGCCGTTTATATTTTATAAAGATGGGATTTAACTTAAA 
TGCAATTAATTATGGCGTAAATAGAGTGAAAACATGGCTAATATTCACTAAGTCCTG

AATTTTATATAAAGTTTAATCTGTTATTTTAGCGTTTACCTGGTCTTATCAGTGAGG 

TTTATAGCCATTATTAGTGGGATTGAAGTGATTTTTAAAGCTATGTATATTATTGCA 

AATATAAATTGTAACAATTAAGACTTTGGACACTTGAGTTCAATTTCGAATTGATTG 

GCATAAAATTTAAAACAGCTAAATCTACCTCAATCATTTTAGCAAATGTATGCAGGT 

AGATTTTTTTCGCCATTTAAGAGTACACTTGTACGCTAGGTTTTTGTTTAGTGTGCA 

AATGAACGTTTTGATGAGCATTGTTTTTAGAGCACAAAATAGATCCTTACAGGAGCA 

ATAACGCAATGGCTAAAAAGAACACCACATCGATTAAGCACGCCAAGGATGTGTTAA 

GTAGTGATGATCAACAGTTAAATTCTCGCTTGCAAGAATGTCCGATTGCCATCATTG 

GTATGGCATCGGTTTTTGCAGATGCTAAAAACTTGGATCAATTCTGGGATAACATCG 

TTGACTCTGTGGACGCTATTATTGATGTGCCTAGCGATCGCTGGAACATTGACGACC 

ATTACTCGGCTGATAAAAAAGCAGCTGACAAGACATACTGCAAACGCGGTGGTTTCA 

TTCCAGAGCTTGATTTTGATCCGATGGAGTTTGGTTTACCGCCAAATATCCTCGAGT 

TAACTGACATCGCTCAATTGTTGTCATTAATTGTTGCTCGTGATGTATTAAGTGATG 

CTGGCATTGGTAGTGATTATGACCATGATAAAATTGGTATCACGCTGGGTGTCGGTG 

GTGGTCAGAAACAAATTTCGCCATTAACGTCGCGCCTACAAGGCCCGGTATTAGAAA 

AAGTATTAAAAGCCTCAGGCATTGATGAAGATGATCGCGCTATGATCATCGACAAAT 

TTAAAAAAGCCTACATCGGCTGGGAAGAGAACTCATTCCCAGGCATGCTAGGTAACG 

TTATTGCTGGTCGTATCGCCAATCGTTTTGATTTTGGTGGTACTAACTGTGTGGTTG 

ATGCGGCATGCGCTGGCTCCCTTGCAGCTGTTAAAATGGCGATCTCAGACTTACTTG 

AATATCGTTCAGAAGTCATGATATCGGGTGGTGTATGTTGTGATAACTCGCCATTCA

TGTATATGTCATTCTCGAAAACACCAGCATTTACCACCAATGATGATATCCGTCCGT 

TTGATGACGATTCAAAAGGCATGCTGGTTGGTGAAGGTATTGGCATGATGGCGTTTA 

AACGTCTTGAAGATGCTGAACGTGACGGCGACAAAATTTATTCTGTACTGAAAGGTA 

TCGGTACATCTTCAGATGGTCGTTTCAAATCTATTTACGCTCCACGCCCAGATGGCC 

AAGCAAAAGCGCTAAAACGTGCTTATGAAGATGCCGGTTTTGCCCCTGAAACATGTG 
GTCTAATTGAAGGCCATGGTACGGGTACCAAAGCGGGTGATGCCGCAGAATTTGCTG
GCTTGACCAAACACTTTGGCGCCGCCAGTGATGAAAAGCAATATATCGCCTTAGGCT 
CAGTTAAATCGCAAATTGGTCATACTAAATCTGCGGCTGGCTCTGCGGGTATGATTA 
AGGCGGCATTAGCGCTGCATCATAAAATCTTACCTGCAACGATCCATATCGATAAAC 
CAAGTGAAGCCTTGGATATCAAAAACAGCCCGTTATACCTAAACAGCGAAACGCGTC 
CTTGGATGCCACGTGAAGATGGTATTCCACGTCGTGCAGGTATCAGCTCATTTGGTT 
TTGGCGGCACCAACTTCCATATTATTTTAGAAGAGTATCGCCCAGGTCACGATAGCG 
CATATCGCTTAAACTCAGTGAGCCAAACTGTGTTGATCTCGGCAAACGACCAACAAG 
GTATTGTTGCTGAGTTAAATAACTGGCGTACTAAACTGGCTGTCGATGCTGATCATC 
AAGGGTTTGTATTTAATGAGTTAGTGACAACGTGGCCATTAAAAACCCCATCCGTTA 
ACCAAGCTCGTTTAGGTTTTGTTGCGCGTAATGCAAATGAAGCGATCGCGATGATTG 
ATACGGCATTGAAACAATTCAATGCGAACGCAGATAAAATGACATGGTCAGTACCTA 
CCGGGGTTTACTATCGTCAAGCCGGTATTGATGCAACAGGTAAAGTGGTTGCGCTAT 
TCTCAGGGCAAGGTTCGCAATACGTGAACATGGGTCGTGAATTAACCTGTAACTTCC
CAAGCATGATGCACAGTGCTGCGGCGATGGATAAAGAGTTCAGTGCCGCTGGTTTAG 
GCCAGTTATCTGCAGTTACTTTCCCTATCCCTGTTTATACGGATGCCGAGCGTAAGC 
TACAAGAAGAGCAATTACGTTTAACGCAACATGCGCAACCAGCGATTGGTAGTTTGA 
GTGTTGGTCTGTTCAAAACGTTTAAGCAAGCAGGTTTTAAAGCTGATTTTGCTGCCG 
GTCATAGTTTCGGTGAGTTAACCGCATTATGGGCTGCCGATGTATTGAGCGAAAGCG 
ATTACATGATGTTAGCGCGTAGTCGTGGTCAAGCAATGGCTGCGCCAGAGCAACAAG 
ATTTTGATGCAGGTAAGATGGCCGCTGTTGTTGGTGATCCAAAGCAAGTCGCTGTGA 
TCATTGATACCCTTGATGATGTCTCTATTGCTAACTTCAACTCGAATAACCAAGTTG 
TTATTGCTGGTACTACGGAGCAGGTTGCTGTAGCGGTTACAACCTTAGGTAATGCTG 
GTTTCAAAGTTGTGCCACTGCCGGTATCTGCTGCGTTCCATACACCTTTAGTTCGTC 
ACGCGCAAAAACCATTTGCTAAAGCGGTTGATAGCGCTAAATTTAAAGCGCCAAGCA 
TTCCAGTGTTTGCTAATGGCACAGGCTTGGTGCATTCAAGCAAACCGAATGACATTA 
AGAAAAACCTGAAAAACCACATGCTGGAATCTGTTCATTTCAATCAAGAAATTGACA
ACATCTATGCTGATGGTGGCCGCGTATTTATCGAATTTGGTCCAAAGAATGTATTAA 
CTAAATTGGTTGAAAACATTCTCACTGAAAAATCTGATGTGACTGCTATCGCGGTTA 
ATGCTAATCCTAAACAACCTGCGGACGTACAAATGCGCCAAGCTGCGCTGCAAATGG 
CAGTGCTTGGTGTCGCATTAGACAATATTGACCCGTACGACGCCGTTAAGCGTCCAC 
TTGTTGCGCCGAAAGCATCACCAATGTTGATGAAGTTATCTGCAGCGTCTTATGTTA 
GTCCGAAAACGAAGAAAGCGTTTGCTGATGCATTGACTGATGGCTGGACTGTTAAGC 
AAGCGAAAGCTGTACCTGCTGTTGTGTCACAACCACAAGTGATTGAAAAGATCGTTG 
AAGTTGAAAAGATAGTTGAACGCATTGTCGAAGTAGAGCGTATTGTCGAAGTAGAAA 
AAATCGTCTACGTTAATGCTGACGGTTCGCTTATATCGCAAAATAATCAAGACGTTA 
ACAGCGCTGTTGTTAGCAACGTGACTAATAGCTCAGTGACTCATAGCAGTGATGCTG 
ACCTTGTTGCCTCTATTGAACGCAGTGTTGGTCAATTTGTTGCACACCAACAGCAAT 
TATTAAATGTACATGAACAGTTTATGCAAGGTCCACAAGACTACGCGAAAACAGTGC 
AGAACGTACTTGCTGCGCAGACGAGCAATGAATTACCGGAAAGTTTAGACCGTACAT 
TGTCTATGTATAACGAGTTCCAATCAGAAACGCTACGTGTACATGAAACGTACCTGA
ACAATCAGACGAGCAACATGAACACCATGCTTACTGGTGCTGAAGCTGATGTGCTAG 
CAACCCCAATAACTCAGGTAGTGAATACAGCCGTTGCCACTAGTCACAAGGTAGTTG 
CTCCAGTTATTGCTAATACAGTGACGAATGTTGTATCTAGTGTCAGTAATAACGCGG 
CGGTTGCAGTGCAAACTGTGGCATTAGCGCCTACGCAAGAAATCGCTCCAACAGTCG 
CTACTACGCCAGCACCCGCATTGGTTGCTATCGTGGCTGAACCTGTGATTGTTGCGC 
ATGTTGCTACAGAAGTTGCACCAATTACACCATCAGTTACACCAGTTGTCGCAACTC 
AAGCGGCTATCGATGTAGCAACTATTAACAAAGTAATGTTAGAAGTTGTTGCTGATA 
AAACCGGTTATCCAACGGATATGCTGGAACTGAGCATGGACATGGAAGCTGACTTAG 
GTATCGACTCAATCAAACGTGTTGAGATATTAGGCGCAGTACAGGAATTGATCCCTG 
ACTTACCTGAACTTAATCCTGAAGATCTTGCTGAGCTACGCACGCTTGGTGAGATTG 
TCGATTACATGAATTCAAAAGCCCAGGCTGTAGCTCCTACAACAGTACCTGTAACAA 
GTGCACCTGTTTCGCCTGCATCTGCTGGTATTGATTTAGCCCACATCCAAAACGTAA
TGTTAGAAGTGGTTGCAGACAAAACCGGTTACCCAACAGACATGCTAGAACTGAGCA 
TGGATATGGAAGCTGACTTAGGTATTGATTCAATCAAGCGTGTGGAAATCTTAGGTG 
CAGTACAGGAGATCATAACTGATTTACCTGAGCTAAACCCTGAAGATCTTGCTGAAT 
TACGCACCCTAGGTGAAATCGTTAGTTACATGCAAAGCAAAGCGCCAGTCGCTGAAA 
GTGCGCCAGTGGCGACGGCTCCTGTAGCAACAAGCTCAGCACCGTCTATCGATTTGA 
ACCACATTCAAACAGTGATGATGGATGTAGTTGCAGATAAGACTGGTTATCCAACTG 
ACATGCTAGAACTTGGCATGGACATGGAAGCTGATTTAGGTATCGATTCAATCAAAC 
GTGTGGAAATATTAGGCGCAGTGCAGGAGATCATCACTGATTTACCTGAGCTAAACC 
CAGAAGACCTCGCTGAATTACGCACGCTAGGTGAAATCGTTAGTTACATGCAAAGCA 
AAGCGCCAGTCGCTGAGAGTGCGCCAGTAGCGACGGCTTCTGTAGCAACAAGCTCTG 
CACCGTCTATCGATTTAAACCATATCCAAACAGTGATGATGGAAGTGGTTGCAGACA 
AAACCGGTTATCCAGTAGACATGTTAGAACTTGCTATGGACATGGAAGCTGACCTAG 
GTATCGATTCAATCAAGCGTGTAGAAATTTTAGGTGCGGTACAGGAAATCATTACTG 
ACTTACCTGAGCTTAACCCTGAAGATCTTGCTGAACTACGTACATTAGGTGAAATCG 
TTAGTTACATGCAAAGCAAAGCGCCCGTAGCTGAAGCGCCTGCAGTACCTGTTGCAG 
TAGAAAGTGCACCTACTAGTGTAACAAGCTCAGCACCGTCTATCGATTTAGACCACA 
TCCAAAATGTAATGATGGATGTTGTTGCTGATAAGACTGGTTATCCTGCCAATATGC 
TTGAATTAGCAATGGACATGGAAGCCGACCTTGGTATTGATTCAATCAAGCGTGTTG 
AAATTCTAGGCGCGGTACAGGAGATCATTACTGATTTACCTGAACTAAACCCAGAAG 
ACTTAGCTGAACTACGTACGTTAGAAGAAATTGTAACCTACATGCAAAGCAAGGCGA 
GTGGTGTTACTGTAAATGTAGTGGCTAGCCCTGAAAATAATGCTGTATCAGATGCAT 
TTATGCAAAGCAATGTGGCGACTATCACAGCGGCCGCAGAACATAAGGCGGAATTTA 
AACCGGCGCCGAGCGCAACCGTTGCTATCTCTCGTCTAAGCTCTATCAGTAAAATAA 
GCCAAGATTGTAAAGGTGCTAACGCCTTAATCGTAGCTGATGGCACTGATAATGCTG 
TGTTACTTGCAGACCACCTATTGCAAACTGGCTGGAATGTAACTGCATTOCAACCAA

CTTGGGTAGCTGTAACAACGACGAAAGCATTTAATAAGTCAGTGAACCTGGTGACTT 

TAAATGGCGTTGATGAAACTGAAATCAACAACATTATTACTGCTAACGCACAATTGG 

ATGCAGTTATCTATCTGCACGCAAGTAGCGAAATTAATGCTATCGAATACCCACAAG

CATCTAAGCAAGGCCTGATGTTAGCCTTCTTATTAGCGAAATTGAGTAAAGTAACTC

AAGCCGCTAAAGTCCGTGGCGCCTTTATGATTGTTACTCAGCAGGGTGGTTCATTAG

GTTTTGATGATATCGATTCTGCTACAAGTCATGATGTGAAAACAGAGCTAGTACAAA

GCGGCTTAAACGGTTTAGTTAAGACACTGTCTCACGAGTGGGATAACGTATTCTGTC

GTGCGGTTGATATTGCTTCGTCATTAACGGCTGAACAAGTTGCAAGCCTTGTTAGTG

ATGAACTACTTGATGCTAACACTGTATTAACAGAAGTGGGTTATCAACAAGCTGGTA

A;,GGCCTrGAACGTATCACGTTAACrGGTGTGGCTACTGACAGCTATGCATTAACAG 

CTGGCAATAACATCGATGCTAACTCGGTATTTTTAGTGAGTGGTGGCGCAAAAGGTG

TAACTGCACATTGTGTTGCTCGTATAGCTAAAGAATATCAGTCTAAGTTCATCTTAT

TGGGACGTTCAACGTTCTCAAGTGACGAACCGAGCTGGGCAAGTGGTATTACTGATG 

W.GCGGCGTTAAAGAAAGCAGCGATGCAGTCTTTGATTACAGCAGGTGATAAACCAA 

CACCCGTTAAGATCGTACAGCTAATCAAACCAATCCAAGCTAATCGTGAAATTGCGC 

^CCrrGTCTGCAATTACCGCTGCTGGTGGCCAAGCrGAATATGTTTCTGCAGATG 

TAACTAATGCAGCAAGCGTACAAATGGCAGTCGCTCCAGCTATCGCTAAGTTCGGTG 

CAATCACTGGCATCATTCATGGCGCGGGTGTGTTAGCTGACCAATTCATTGAGCAAA

ftAACACTGAGTGAnTTGAGTCTGTTTACAGCACTAAAATTGACGGTTTGTTATCGC 

TACTATCAGTCACTGAAGCAAGCAACATCAAGCAArrGGTATTGTrcrqGTCAGCGG 

CTGGTTTCTACGGTAACCCCGGCCAGTCTGATTACTCGATTGCCAATGAGATCTTAA 

ATAAAACCGCATACCGCTTTAAATCATTGCACCCACAAGCTCAAGTATTGAGCTTTA 

ACTGGGGTCCTTGGGACGGTGGCATGGTAACGCCTGAGCTTAAACGTATGTTTGACC

MCGTGGTGTTTACATTATTCCACTTGATGCAGGTGCACAGTTATTGCTGAATGAAC 
TAGCCGCTAATGATAACCGTTGTCCACAAATCCTCGTGGGTAATGACTTATCTAAAG

ATGCTAGCTCTGATCAAAAGTCTGATGAAAAGAGTACTGCTGTAAAAAAGCCACAAG 

TTAGTCGTTTATCAGATGCTTTAGTAACTAAAAGTATCAAAGCGACTAACAGTAGCT 

CTTTATCAAACAAGACTAGTGCTTTATCAGACAGTAGTGCTTTTCAGGTTAACGAAA 

ACCACTTTTTAGCTGACCACATGATCAAAGGCAATCAGGTATTACCAACGGTATGCG 

CGATTGCTTGGATGAGTGATGCAGCAAAAGCGACTTATAGTAACCGAGACTGTGCAT 

TGAAGTATGTCGGTTTCGAAGACTATAAATTGTTTAAAGGTGTGGTTTTTGATGGCA 

ATGAGGCGGCGGATTACCAAATCCAATTGTCGCCTGTGACAAGGGCGTCAGAACAGG 

ATTCTGAAGTCCGTATTGCCGCAAAGATCTTTAGCCTGAAAAGTGACGGTAAACCTG 

TGTTTCATTATGCAGCGACAATATTGTTAGCAACTCAGCCACTTAATGCTGTGAAGG 

TAGAACTTCCGACATTGACAGAAAGTGTTGATAGCAACAATAAAGTAACTGATGAAG 

CACAAGCGTTATACAGCAATGGCACCTTGTTCCACGGTGAAAGTCTGCAGGGCATTA 

AGCAGATATTAAGTTGTGACGACAAGGGCCTGCTATTGGCTTGTCAGATAACCGATG 

TTGCAACAGCTAAGCAGGGATCCTTCCCGTTAGCTGACAACAATATCTTTGCCAATG 

ATTTGGTTTATCAGGCTATGTTGGTCTGGGTGCGCAAACAATTTGGTTTAGGTAGCT 

TACCTTCGGTGACAACGGCTTGGACTGTGTATCGTGAAGTGGTTGTAGATGAAGTAT 

TTTATCTGCAACTTAATGTTGTTGAGCATGATCTATTGGGTTCACGCGGCAGTAAAG 

CCCGTTGTGATATTCAATTGATTGCTGCTGATATGCAATTACTTGCCGAAGTGAAAT 

CAGCGCAAGTCAGTGTCAGTGACATTTTGAACGATATGTCATGATCGAGTAAATAAT 

AACGATAGGCGTCATGGTGAGCATGGCGTCTGCTTTCTTCATTTTTTAACATTAACA 

ATATTAATAGCTAAACGCGGTTGCTTTAAACCAAGTAAACAAGTGCTTTTAGCTATT 

ACTATTCCAAACAGGATATTAAAGAGAATATGACGGAATTAGCTGTTATTGGTATGG 

ATGCTAAATTTAGCGGACAAGACAATATTGACCGTGTGGAACGCGCTTTCTATGAAG 

GTGCTTATGTAGGTAATGTTAGCCGCGTTAGTACCGAATCTAATGTTATTAGCAATG 

GCGAAGAACAAGTTATTACTGCCATGACAGTTCTTAACTCTGTCAGTCTACTAGCGC 
AAACGAATCAGTTAAATATAGCTGATATCGCGGTGTTGCTGATTGCTGATGTAAAAA
GTGCTGATGATCAGCTTGTAGTCCAAATTGCATCAGCAATTGAAAAACAGTGTGCGA 
GTTGTGTTGTTATTGCTGATTTAGGCCAAGCATTAAATCAAGTAGCTGATTTAGTTA 
ATAACCAAGACTGTCCTGTGGCTGTAATTGGCATGAATAACTCGGTTAATTTATCTC 
GTCATGATCTTGAATCTGTAACTGCAACAATCAGCTTTGATGAAACCTTCAATGGTT 
ATAACAATGTAGCTGGGTTCGCGAGTTTACTTATCGCTTCAACTGCGTTTGCCAATG 
CTAAGCAATGTTATATATACGCCAACATTAAGGGCTTCGCTCAATCGGGCGTAAATG 
CTCAATTTAACGTTGGAAACATTAGCGATACTGCAAAGACCGCATTGCAGCAAGCTA 
GCATAACTGCAGAGCAGGTTGGTTTGTTAGAAGTGTCAGCAGTCGCTGATTCGGCAA 
TCGCATTGTCTGAAAGCCAAGGTTTAATGTCTGCTTATCATCATACGCAAACTTTGC 
ATACTGCATTAAGCAGTGCCCGTAGTGTGACTGGTGAAGGCGGGTGTTTTTCACAGG 
TCGCAGGTTTATTGAAATGTGTAATTGGTTTACATCAACGTTATATTCCGGCGATTA 
AAGATTGGCAACAACCGAGTGACAATCAAATGTCACGGTGGCGGAATTCACCATTCT 
ATATGCCTGTAGATGCTCGACCTTGGTTCCCACATGCTGATGGCTCTGCACACATTG 
CCGCTTATAGTTGTGTGACTGCTGACAGCTATTGTCATATTCTTTTACAAGAAAACG 
TCTTACAAGAACTTGTTTTGAAAGAAACAGTCTTGCAAGATAATGACTTAACTGAAA 
GCAAGCTTCAGACTCTTGAACAAAACAATCCAGTAGCTGATCTGCGCACTAATGGTT 
ACTTTGCATCGAGCGAGTTAGCATTAATCATAGTACAAGGTAATGACGAAGCACAAT 
TACGCTGTGAATTAGAAACTATTACAGGGCAGTTAAGTACTACTGGCATAAGTACTA 
TCAGTATTAAACAGATCGCAGCAGACTGTTATGCCCGTAATGATACTAACAAAGGCT 
ATAGCGCAGTGCTTATTGCCGAGACTGCTGAAGAGTTAAGCAAAGAAATAACCTTGG 
CGTTTGCTGGTATCGCTAGCGTGTTTAATGAAGATGCTAAAGAATGGAAAACCCCGA 
AGGGCAGTTATTTTACCGCGCAGCCTGCAAATAAACAGGCTGCTAACAGCACACAGA 
ATGGTGTCACCTTCATGTACCCAGGTATTGGTGCTACATATGTTGGTTTAGGGCGTG 
ATCTATTTCATCTATTCCCACAGATTTATCAGCCTGTAGCGGCTTTAGCCGATGACA 
TTGGCGAAAGTCTAAAAGATACTTTACTTAATCCACGCAGTATTAGTCGTCATAGCT
TTAAAGAACTCAAGCAGTTGGATCTGGACCTGCGCGGTAACTTAGCCAATATCGCTG 
AAGCCGGTGTGGGTTTTGCTTGTGTGTTTACCAAGGTATTTGAAGAAGTCTTTGCCG 
TTAAAGCTGACTTTGCTACAGGTTATAGCATGGGTGAAGTAAGCATGTATGCAGCAC 
TAGGCTGCTGGCAGCAACCGGGATTGATGAGTGCTCGCCTTGCACAATCGAATACCT 
TTAATCATCAACTTTGCGGCGAGTTAAGAACACTACGTCAGCATTGGGGCATGGATG 
ATGTAGCTAACGGTACGTTCGAGCAGATCTGGGAAACCTATACCATTAAGGCAACGA 
TTGAACAGGTCGAAATTGCCTCTGCAGATGAAGATCGTGTGTATTGCACCATTATCA 
ATACACCTGATAGCTTGTTGTTAGCCGGTTATCCAGAAGCCTGTCAGCGAGTCATTA 
AGAATTTAGGTGTGCGTGCAATGGCATTGAATATGGCGAACGCAATTCACAGCGCGC 
CAGCTTATGCCGAATACGATCATATGGTTGAGCTATACCATATGGATGTTACTCCAC 
GTATTAATACCAAGATGTATTCAAGCTCATGTTATTTACCGATTCCACAACGCAGCA 
AAGCGATTTCCCACAGTATTGCTAAATGTTTGTGTGATGTGGTGGATTTCCCACGTT 
TGGTTAATACCTTACATGACAAAGGTGCGCGGGTATTCATTGAAATGGGTCCAGGTC 
GTTCGTTATGTAGCTGGGTAGATAAGATCTTAGTTAATGGCGATGGCGATAATAAAA 
AGCAAAGCCAACATGTATCTGTTCCTGTGAATGCCAAAGGCACCAGTGATGAACTTA 
CTTATATTCGTGCGATTGCTAAGTTAATTAGTCATGGCGTGAATTTGAATTTAGATA 
GCTTGTTTAACGGGTCAATCCTGGTTAAAGCAGGCCATATAGCAAACACGAACAAAT 
AGTCAACATCGATATCTAGCGCTGGTGAGTTATACCTCATTAGTTGAAATATGGATT 
TAAAGAGAGTAATTATGGAAAATATTGCAGTAGTAGGTATTGCTAATTTGTTCCCGG 
GCTCACAAGCACCGGATCAATTTTGGCAGCAATTGCTTGAACAACAAGATTGCCGCA
GTAAGGCGACCGCTGTTCAAATGGGCGTTGATCCTGCTAAATATACCGCCAACAAAG 
GTGACACAGATAAATTTTACTGTGTGCACGGCGGTTACATCAGTGATTTCAATTTTG 
ATGCTTCAGGTTATCAACTCGATAATGATTATTTAGCCGGTTTAGATGACCTTAATC 
AATGGGGGCTTTATGTTACGAAACAAGCCCTTACCGATGCGGGTTATTGGGGGAGTA 
CTGCACTAGAAAACTGTGGTGTGATTTTAGGTAATTTGTCATTCCCAACTAAATCAT
CTAATCAGCTGTTTATGCCTTTGTATCATCAAGTTGTTGATAATGCCTTAAAGGCGG 
TATTACATCCTGATTTTCAATTAACGCATTACACAGCACCGAAAAAAACACATGCTG 
ACAATGCATTAGTAGCAGGTTATCCAGCTGCATTGATCGCGCAAGCGGCGGGTCTTG 
GTGGTTCACATTTTGCACTGGATGCGGCTTGTGCTTCATCTTGTTATAGCGTTAAGT 
TAGCGTGTGATTACCTGCATACGGGTAAAGCCAACATGATGCTTGCTGGTGCGGTAT 
CTGCAGCAGATCCTATGTTCGTAAATATGGGTTTCTCGATATTCCAAGCTTACCCAG 
CTAACAATGTACATGCCCCGTTTGACCAAAATTCACAAGGTCTATTTGCCGGTGAAG 
GCGCGGGCATGATGGTATTGAAACGTCAAAGTGATGCAGTACGTGATGGTGATCATA 
TTTACGCCATTATTAAAGGCGGCGCATTATCGAATGACGGTAAAGGCGAGTTTGTAT 
TAAGCCCGAACACCAAGGGCCAAGTATTAGTATATGAACGTGCTTATGCCGATGCAG 
ATGTTGACCCGAGTACAGTTGACTATATTGAATGTCATGCAACGGGCACACCTAAGG 
GTGACAATGTTGAATTGCGTTCGATGGAAACCTTTTTCAGTCGCGTAAATAACAAAC 
CATTACTGGGCTCGGTTAAATCTAACCTTGGTCATTTGTTAACTGCCGCTGGTATGC 
CTGGCATGACCAAAGCTATGTTAGCGCTAGGTAAAGGTCTTATTCCTGCAACGATTA 
ACTTAAAGCAACCACTGCAATCTAAAAACGGTTACTTTACTGGCGAGCAAATGCCAA 
CGACGACTGTGTCTTGGCCAACAACTCCGGGTGCCAAGGCAGATAAACCGCGTACCG 
CAGGTGTGAGCGTATTTGGTTTTGGTGGCAGCAACGCCCATTTGGTATTACAACAGC 
CAACGCAAACACTCGAGACTAATTTTAGTGTTGCTAAACCACGTGAGCCTTTGGCTA 
TTATTGGTATGGACAGCCATTTTGGTAGTGCCAGTAATTTAGCGCAGTTCAAAACCT 
TATTAAATAATAATCAAAATACCTTCCGTGAATTACCAGAACAACGCTGGAAAGGCA 
TGGAAAGTAACGCTAACGTCATGCAGTCGTTACAATTACGCAAAGCGCCTAAAGGCA 
GTTACGTTGAACAGCTAGATATTGATTTCTTGCGTTTTAAAGTACCGCCTAATGAAA 
AAGATTGCTTGATCCCGCAACAGTTAATGATGATGCAAGTGGCAGACAATGCTGCGA 
AAGACGGAGGTCTAGTTGAAGGTCGTAATGTTGCGGTATTAGTAGCGATGGGCATGG 
AACTGGAATTACATCAGTATCGTGGTCGCGTTAATCTiU^CCACCCAAATTGAAGACA

GCTTATTACAGCAAGGTATTAACCTGACTGTTGAGCAACGTGAAGAACTGACCAATA

TTGCTAAAGACGGTGTTGCCTCGGCTGCACAGCTAAATCAGTATACGAGTTTCATTG 

GTAATATTATGGCGTCACGTATTTCGGCGTTATGGGATTTTTCTGGTCCTGCTATTA 

CCGTATCGGCTGAAGAAAACTCTGTTTATCGTTGTGTTGAATTAGCTGAAAATCTAT 

TTCAAACCAGTGATGTTGAAGCCGTTATTATTGCTGCTGTTGATTTGTCTGGTTCAA 

TTGAAAACATTACTTTACGTCAGCACTACGGTCCAGTTAATGAAAAGGGATCTGTAA 

GTGAATGTGGTCCGGTTAATGAAAGCAGTTCAGTAACCAACAATATTCTTGATCAGC 

AACAATGGCTGGTGGGTGAAGGCGCAGCGGCTATTGTCGTTAAACCGTCATCGCAAG 

TCACTGCTGAGCAAGTTTATGCGCGTATTGATGCGGTGAGTTTTGCCCCTGGTAGCA 

ATGCGAAAGCAATTACGATTGCAGCGGATAAAGCATTAACACTTGCTGGTATCAGTG 

CTGCTGATGTAGCTAGTGTTGAAGCACATGCAAGTGGTTTTAGTGCCGAAAATAATG 

CTGAAAAAACCGCGTTACCGACTTTATACCCAAGCGCAAGTATCAGTTCGGTGAAAG

CCAATATTGGTCATACGTTTAATGCCTCGGGTATGGCGAGTATTATTAAAACGGCGC 

TGCTGTTAGATCAGAATACGAGTCAAGATCAGAAAAGCAAACATATTGCTATTAACG 

GTCTAGGTCGTGATAACAGCTGCGCGCATCTTATCTTATCGAGTTCAGCGCAAGCGC 

ATCAAGTTGCACCAGCGCCTGTATCTGGTATGGCCAAGCAACGCCCACAGTTAGTTA 

AAACCATCAAACTCGGTGGTCAGTTAATTAGCAACGCGATTGTTAACAGTGCGAGTT 

CATCTTTACACGCTATTAAAGCGCAGTTTGCCGGTAAGCACTTAAACAAAGTTAACC 

AGCCAGTGATGATGGATAACCTGAAGCCCCAAGGTATTAGCGCTCATGCAACCAATG 

AGTATGTGGTGACTGGAGCTGCTAACACTCAAGCTTCTAACATTCAAGCATCTCATG 

TTCAAGCGTCAAGTCATGCACAAGAGATAGCACCAAACCAAGTTCAAAATATGCAAG 

CTACAGCAGCCGCTGTAAGTTCACCCCTTTCTCAACATCAACACACAGCGCAGCCCG 

TAGCGGCACCGAGCGTTGTTGGAGTGACTGTGAAACATAAAGCAAGTAACCAAATTC 

ATCAGCAAGCGTCTACGCATAAAGCATTTTTAGAAAGTCGTTTAGCTGCACAGAAAA 
ACCTATCGCAACTTGTTGAATTGCAAACCAAGCTGTCAATCCAAACTGGTAGTGACA

ATACATCTAACAATACTGCGTCAACAAGCAATACAGTGCTAACAAATCCTGTATCAG 

CAACGCCATTAACACTTGTGTCTAATGCGCCTGTAGTAGCGACAAACCTAACCAGTA 

CAGAAGCAAAAGCGCAAGCAGCTGCTACACAAGCTGGTTTTCAGATAAAAGGACCTG 

TTGGTTACAACTATCCACCGCTGCAGTTAATTGAACGTTATAATAAACCAGAAAACG 

TGATTTACGATCAAGCTGATTTGGTTGAATTCGCTGAAGGTGATATTGGTAAGGTAT 

TTGGTGCTGAATACAATATTATTGATGGCTATTCGCGTCGTGTACGTCTGCCAACCT 

CAGATTACTTGTTAGTAACACGTGTTACTGAACTTGATGCCAAGGTGCATGAATACA 

AGAAATCATACATGTGTACTGAATATGATGTGCCTGTTGATGCACCGTTCTTAATTG 

ATGGTCAGATCCCTTGGTCTGTTGCCGTCGAATCAGGCCAGTGTGATTTGATGTTGA 

TTTCATATATCGGTATTGATTTCCAAGCGAAAGGCGAACGTGTTTACCGTTTACTTG 

ATTGTGAATTAACTTTCCTTGAAGAGATGGCTTTTGGTGGCGATACTTTACGTTACG

AGATCCACATTGATTCGTATGCACGTAACGGCGAGCAATTATTATTCTTCTTCCATT 

ACGATTGTTACGTAGGGGATAAGAAGGTACTTATCATGCGTAATGGTTGTGCTGGTT 

TCTTTACTGACGAAGAACTTTCTGATGGTAAAGGCGTTATTCATAACGACAAAGACA 

AAGCTGAGTTTAGCAATGCTGTTAAATCATCATTCACGCCGTTATTACAACATAACC 

GTGGTCAATACGATTATAACGACATGATGAAGTTGGTTAATGGTGATGTTGCCAGTT 

GTTTTGGTCCGCAATATGATCAAGGTGGCCGTAATCCATCATTGAAATTCTCGTCTG 

AGAAGTTCTTGATGATTGAACGTATTACCAAGATAGACCCAACCGGTGGTCATTGGG 

GACTAGGCCTGTTAGAAGGTCAGAAAGATTTAGACCCTGAGCATTGGTATTTCeCTT 

GTCACTTTAAAGGTGATCAAGTAATGGCTGGTTCGTTGATGTCGGAAGGTTGTGGCC 

AAATGGCGATGTTCTTCATGCTGTCTCTTGGTATGCATACCAATGTGAACAACGCTC 

GTTTCCAACCACTACCAGGTGAATCACAAACGGTACGTTGTCGTGGGCAAGTACTGC 

CACAGCGCAATACCTTAACTTACCGTATGGAAGTTACTGCGATGGGTATGCATCCAC 

AGCCATTCATGAAAGCTAATATTGATATTTTGCTTGACGGTAAAGTGGTTGTTGATT 
TCAAAAACTTGAGCGTGATGATCAGCGAACAAGATGAGCATTCAGATTACCCTGTAA

CACTGCCGAGTAATGTGGCGCTTAAAGCGATTACTGCACCTGTTGCGTCAGTAGCAC 

CAGCATCTTCACCCGCTAACAGCGCGGATCTAGACGAACGTGGTGTTGAACCGTTTA 

AGTTTCCTGAACGTCCGTTAATGCGTGTTGAGTCAGACTTGTCTGCACCGAAAAGCA 

AAGGTGTGACACCGATTAAGCATTTTGAAGCGCCTGCTGTTGCTGGTCATCATAGAG 

TGCCTAACCAAGCACCGTTTACACCTTGGCATATGTTTGAGTTTGCGACGGGTAATA 

TTTCTAACTGTTTCGGTCCTGATTTTGATGTTTATGAAGGTCGTATTCCACCTCGTA 

CACCTTGTGGCGATTTACAAGTTGTTACTCAGGTTGTAGAAGTGCAGGGCGAACGTC 

TTGATCTTAAAAATCCATCAAGCTGTGTAGCTGAATACTATGTACCGGAAGACGCTT 

GGTACTTTACTAAAAACAGCCATGAAAACTGGATGCCTTATTCATTAATCATGGAAA 

TTGCATTGCAACCAAATGGCTTTATTTCTGGTTACATGGGCACGACGCTTAAATACC 

CTGAAAAAGATCTGTTCTTCCGTAACCTTGATGGTAGCGGCACGTTATTAAAGCAGA 

TTGATTTACGCGGCAAGACCATTGTGAATAAATCAGTCTTGGTTAGTACGGCTATTG

CTGGTGGCGCGATTATTCAAAGTTTCACGTTTGATATGTCTGTAGATGGCGAGCTAT 

TTTATACTGGTAAAGCTGTATTTGGTTACTTTAGTGGTGAATCACTGACTAACCAAC 

TGGGCATTGATAACGGTAAAACGACTAATGCGTGGTTTGTTGATAACAATACCCCCG 

CAGCGAATATTGATGTGTTTGATTTAACTAATCAGTCATTGGCTCTGTATAAAGCGC 

CTGTGGATAAACCGCATTATAAATTGGCTGGTGGTCAGATGAACTTTATCGATACAG 

TGTCAGTGGTTGAAGGCGGTGGTAAAGCGGGCGTGGCTTATGTTTATGGCGAACGTA 

CGATTGATGCTGATGATTGGTTCTTCCGTTATCACTTCCACCAAGATCCGGTGATGC 

CAGGTTCATTAGGTGTTGAAGCTATTATTGAGTTGATGCAGACCTATGCGCTTAAAA 

ATGAmGGGTGGCAAGTTTGCTAACCCACGTTTCATTGCGCCGATGACGCAAGTTG 

ATTGGAAATACCGTGGGCAAATTACGCCGCTGAATAAACAGATGTCACTGGACGTGC 

atatcactg; 

TGT 
AAGCGTAAAGGGTCAAGTGTAACGTGCTTAAGCGCCGCATTGGTTAAAGACGCTTTG
CACGCCGTGAATCCGTCCATGGAGGCTTGGGGTTGGCATCCATGCCAACAACAGCAA 
GCTTACTTTAATCAATACGGCTTGGTGTCCATTTAGACGCCTCGAACTTAGTAGTTA 
ATAGACAAAATAATTTAGCTGTGGAATGAATATAGTAAGTAATCATTCGGCAGCTAC
AAAAAAGGAATTAAGAATGTCGAGTTTAGGTTTTAACAATAACAACGCAATTAACTG 
GGCTTGGAAAGTAGATCCAGCGTCAGTTCATACACAAGATGCAGAAATTAAAGCAGC 
TTTAATGGATCTAACTAAACCTCTCTATGTGGCGAATAATTCAGGCGTAACTGGTAT 
AGCTAATCATACGTCAGTAGCAGGTGCGATCAGCAATAACATCGATGTTGATGTATT 
GGCGTTTGCGCAAAAGTTAAACCCAGAAGATCTGGGTGATGATGCTTACAAGAAACA 
GCACGGCGTTAAATATGCTTATCATGGCGGTGCGATGGCAAATGGTATTGCCTCGGT 
TGAATTGGTTGTTGCGTTAGGTAAAGCAGGGCTGTTATGTTCATTTGGTGCTGCAGG 
TCTAGTGCCTGATGCGGTTGAAGATGCAATTCGTCGTATTCAAGCTGAATTACCAAA 
TSGCCCTTATGCGGTTAACTTGATCCATGCACCAGCAGAAGAAGCATTAGAGCGTGG 

cgcggttgaacgtttcctaaaacttggcgtcaagacggtagaggcttcagcttacct 
tggtttaactgaacacattgtttggtatcgtgctgctggtctaactaaaaacgcaga 
tggcagtgttaatatcggtaacaaggttatcgctaaagtatcgcgtaccgaagttgg 
tcgccgctttatggaacctgcaccgcaaaaattactggataagttattagaacaaaa 

TAAGATCACCCCTGAACAAGCTGCTTTAGCGTTGCTTGTACCTATGGCTGATGATAT 

tactggggaagcggattctggtggtcatacagataaccgtccgtttttaacattatt 
accgacgattattggtctgcgtgatgaagtgcaagcgaagtataacttctctcctgc 
attacgtgttggtgctggtggtggtatcggaacgcctgaagcagcactcgctgcatt 
taacatgggcgcggcttatatcgttctgggttctgtgaatcaggcgtgtgttgaagc 
gggtgcatctgaatatactcgtaaactgttatcgacagttgaaatggctgatgtgac 
tatggcacctgctgcagatatgtttgaaatgggtgtgaagctgcaagtattaaaacg 
cggttctatgttcgcgatgcgtgcgaagaaactgtatgacttgtatgtggcttatga 
CTCGATTGAAGATATCCCAGCTGCTGAACGTGAGAAGATTGAAAAACAAATCTTCCG
TGCAAACCTAGACGAGATTTGGGATGGCACTATCGCTTTCTTTACTGAACGCGATCC 
AGAAATGCTAGCCCGTGCAACGAGTAGTCCTAAACGTAAAATGGCACTTATCTTCCG 
TTGGTATCTTGGCCTTTCTTCACGCTGGTCAAACACAGGCGAGAAGGGACGTGAAAT 
GGATTATCAGATTTGGGCAGGCCCAAGTTTAGGTGCATTCAACAGCTGGGTGAAAGG 
TTCTTACCTTGAAGACTATACCCGCCGTGGCGCTGTAGATGTTGCTTTGCATATGCT 
TAAAGGTGCTGCGTATTTACAACGTGTAAACCAGTTGAAATTGCAAGGTGTTAGCTT 
AAGTACAGAATTGGCAAGTTATCGTACGAGTGATTAATGTTACTTGATGATATGTGA 
ATTAATTAAAGCGCCTGAGGGCGCTTTTTTTGGTTTTTAACTCAGGTGTTGTAACTC 

gaaattgcccctttcaagttagatcgattactcactcacaatatgttgatatcgcac 
ttgccatatacttgctcatccaaagccctatattgataatggtgttaatagtcttta 
atatccgagtctttcttcagcataatactaatatagagactcgaccaatgttaaaca 
cIacaaagaatatattcttgtgtactgccttattattaacgagtgcgagtacgacag 
ctactacgctaaacaattcgatatcagcaattgaacaacgtatttctggtcgtatcg 
gtgtggctgttttagatacgcaaaataaacaaacgtgggcttacaatggtgatgcac 
attttccgatgatgagtacattcaaaaccctcgcttgcgcgaaaatgctaagtgaat 
cgacaaatggtaatctggatcccagtactagctcattgataaaggctgaagaattaa 
tcccttggtcaccagtcactaaaacgtttgtgaataacactattacagtggcgaaag 
cgtgtgaagcaacaatgctgaccagtgataataccgcggctaatattgttttacagt 
atatcggaggccctcaaggcgttactgcattcttgcgagaaattggtgatgaagaga 
gtcagttagatcgtatagaacctgaattgaatgaagctaaggtcggagacttgcgtg 
ataccacgacaccgaaagccatagttaccacgctcaacaaactactacttggtgatg 
ttctacttgatttggataaaaaccaacttaaaacatggatgcaaaataataaagtgt 
cagatcctttactgcgttctatattaccgcaaggctggtttattgccgaccgctcag 
gtgcgggtggtaatggttctcgaggtataactgctatgctttggcactccgagcgtc 
w.ccgctaatcatcaotatttattta;.ccgaaactgagttagcaatggcaatgcgca

ATGAGATTATTGTTGAGATCGGTAAGCTGATATTCMAGAATACGCGGTGAAATAAT 

AAGTTATTTTTTGATAATACTTTAACGAGCGTAGCTATCGAAGTGAGGGCGTCAATT

AGACACCTTTGCTTCCCCTACAAAATCTAATGTGTATTACCTCGGCTAGTACAATTG

CCCTAAGTTATTTCTGTCCAGCTTTGGCTTAGTGCAATTGCGTTAGCCAATGTGAAC 

ACCAAGGGACTTTGTCGTACCATAACTACCAAGCGACTTTGTCGTTTTTATCTTTTC

TTAGACAAACAGAGGTTAAATGAGTGACGCCTTCCAAATCACAGGAATGAATCCGCA

TTTCAATAAAATCTAACCCGTACCAACTCCGTACAAGTTGATCTTTAGTTGTTTAAA

ATCTATAATAAATTCAATTACGGAATTAATCCGTACAACTGGAGGTTTTATGGCTAC 

TGCAAGACrrOATATCCGTTTGGATGAAGAAATCAAAGCTAAGGCTGAGAAAGCATC 

AGCTTTACTCGGCTTAAAAAGTTTAACCGAATACGTTGTTCGCTTAATGGACGAAGA

TTCAACTAAAGTAGTTTCTGAGCATGAGAGTATTACCGTTGAAGCGAATGTATTCGA 

CCAATTTATGGCTGCTTGTGATGAAGCGAAAGCCCCAAATAAAGCATTAGTTGAAGC 

CGCTGTArrrACTCAGAATGGTGAOTTTAAGTGAGTTATrCCAAACGTTTCAAAGAA 

CTGGATAAATCAAAACATGACAGAGCATCAWTGACTGTGGCGAAAAAGAGCTAAAT 

CATTTTATCCAAACTCAAGCAGCCAAACATATGCAAGCAGGTATTAGCCGCACTCTG 

OTTrrACCTGCTTCTGCCCCGTTACCAAACAAAAAATATCCAATTTGCTCATTTTAT 

^GTATCGCGCCAAGCTCAATTAGCCGCOATACGTTACCACAAGCAATGGCTAAAAAG 

TTACCACGTTATCCTATCCCTGTTTTTCTTTTGGCTCAACTTGCCGTCCATAAAGAG

TTTCATGGGAGTGGGTTAGGCAAACTTAGCTTAATTAAAGCGTTAGAGTACCTTTGG 

CAAATTAACTCTCACATGAGAGCTTACGCCATCGTTGTTGATTGTTTAACTGAACAA 

GCTGAGTCATTCTACGCTAAATATGGTTTCGACGTTCTCTGCGAAATAAATGGTCGA 

GTAAGAATGTTCATATCAATGAAAACAGTCAATCAGTTATTCACTTAACAGTAAGAG 

TTAGTATAACAGTTGTATGAATTAAATTTATTATATTCGGTAATCTCATTGCGATCA

CGCTAGAAGTGCGAGCGGGTCAGACCGAGGCCACAATAGCAGCCGTTACGTTTAGGG 
GATGACTTAAAAAGATAACTACTACGTCAGTGGCGATCCTAGAGGATTAAAGGTTTA

TGATTCACAACATTTATTTATTGTGCTTAATTTTTTCTATCCAATATGCGCAAGCTG 

TAAATATCACTGAAGTAGACTTTTATGTCAGTGATGATATCCCTAAAGATGTTGCCA 

AATTAAAGATAGGTGAATCCATAACGAACTCCAGCCTTATTCTAAGTAACTCATCTA 

TTCCACTCTCGCGGGAGACGGGTAACATATATTACTCTTCATCAATTGCTAACTTGA 

ACTATGACTCGATAGAATTTGTTATGGCTCAATTGATGGCCGAAGATTCCAGCCTTT 

ACAAGATGCTGGTAAATAGCGATAGGTTGTCCGTGCTAGTAATGACATCTTCCCAGT 

CCACAGATCTCTATGGCTCGACTTACTCGGCTTATTTTCCTAATGTTGCGGTCATCG 

ATTTGAATTGTGACTCGCTAACTTTAGAACATGAGCTCGGCCATCTATACGGAGCTG 

AACATGAAGAAATATATGACGACTATGTCTTCTATGCTGCGATATGTGGAGACTATA 

CGACTATCATGAACTCTATGCAGCCTGAAATGAAAGAAAAACAAATGATAAAGGCAT 

ATTCATTCCCTGAATTAAAAGTGGATGGCTTGCAGTGCGGAAATGAAAATACGAATA 

ACAAAAAGGTTATTTTAGACAATATTGGTCGGTTTAGATAGGATTGGGATATTATTC 

TCATTCGGCTCTACTTAGTGCTGTTATTATGAGTGCCAGTGCTTCTATCTACGATAT 

TGGTCTTAACAAGTATTTATCTATAGACGCTAAGGTGTTATGTATTTAAGGGATGTT 

CAAGATGAAACTAGGTGTAAACGATGTATAGTTGTATAACATTTTTTCAACGGTTGG 

AACGTTCGATTCTATCGGGTAACAAGACCGCGACGATCCGCGATAAGTCCGATAGTC 

ATTACTTAGTTGGTCAGATGTTAGATGCTTGTACTCACGAAGATAATCGGAAAATGT 

GTCAAATAGAAATACTGAGCATTGAATATGTGACGTTTAGTGAATTAAACCGTGCGC 

ACGCCAATGCTGAAGGTTTACCGTTTTTGTTTATGCTTAAGTGGATAGTTCGAAAGA 

TTTATCCGACTTCAAATGATTTATTTTTCATAAGTTTCAGAGTTGTAACTATCGATA 

TCTTATAAGTCTTAGTGCACAAAACAGAACTATTTATAGCGCTCAAGAAGGCGATAA 

TTTGATAATGAATTATCGCCTTGTTACTATTAAGAGACTTTAAATGACTGAGATATA 

AGATATGACACGGAAGAACATATTGATCACAGGCGCAAGTTCAGGGTTGGGCCGAGG 

TATGGCqVTCGAATTTGCAAAATCAGGTCATAACTTAGCACTTTGTGCACGTAGACT 
TGATAATTTAGTTGCACTGAAAGCAGAACTCTTAGCCCTCAATCCTCACATCCAAAT
CGAAATAAAACCTCTTGATGTCAATGAACATGAACAAGTCTTCACTGTTTTCCATGA 
ATTCAAAGCTGAATTTGGTACGCTTGATCGTATTATTGTTAATGCTGGATTAGGCAA 

GGGTGGATCC 

40138 
1

AAATGCAATTAATTATGGCGTAAATAGAGTGAAAACATGGCTAATATTCACTAAGTC

CTGAATTTTATATAAAGTTTAATCTGTTATTTTAGCGTTTACCTGGTCTTATCAGTG 

AGGTTTATAGCCATTATTAGTGGGATTGAAGTGATTTTTAAAGCTATGTATATTATT 

GCAAATATAAATTGTAACAATTAAGACTTTGGACACTTGAGTTCAATTTCGAATTGA 

TTGGCATAAAATTTAAAACAGCTAAATCTACCTCAATCATTTTAGCAAATGTATGCA 

GGTAGATTTTTTTCGCCATTTAAGAGTACACTTGTACGCTAGGTTTTTGTTTAGTGT 

GCAAATGAACGTTTTGATGAGCATTGTTTTTAGAGCACAAAATAGATCCTTACAGGA 

GCAATAACGCAATGGCTAAAAAGAACACCACATCGATTAAGCACGCCAAGGATGTGT 

TAAGTAGTGATGATCAACAGTTAAATTCTCGCTTGCAAGAATGTCCGATTGCCATCA 

TTGGTATGGCATCGGTTTTTGCAGATGCTAAAAACTTGGATCAATTCTGGGATAACA 

TCGTTGACTCTGTGGACGCTATTATTGATGTGCCTAGCGATCGCTGGAACATTGACG

ACCATTACTCGGCTGATAAAAAAGCAGCTGACAAGACATACTGCAAACGCGGTGGTT 

TCATTCCAGAGCTTGATTTTGATCCGATGGAGTTTGGTTTACCGCCAAATATCCTCG 

AGTTAACTGACATCGCTCAATTGTTGTCATTAATTGTTGCTCGTGATGTATTAAGTG 

ATGCTGGCATTGGTAGTGATTATGACCATGATAAAATTGGTATCACGCTGGGTGTCG 

GTGGTGGTCAGAAACAAATTTCGCCATTAACGTCGCGCCTACAAGGCCCGGTATTAG 

AAAAAGTATTAAAAGCCTCAGGCATTGATGAAGATGATCGCGCTATGATCATCGACA 

AATTTAAAAAAGCCTACATCGGCTGGGAAGAGAACTCATTCCCAGGCATGCTAGGTA 

ACGTTATTGCTGGTCGTATCGCCAATCGTTTTGATTTTGGTGGTACTAACTGTGTGG 

TTGATGCGGCATGCGCTGGCTCCCTTGCAGCTGTTAAAATGGCGATCTCAGACTTAC 

TTGAATATCGTTCAGAAGTCATGATATCGGGTGGTGTATGTTGTGATAACTCGCCAT

TCATGTATATGTCATTCTCGAAAACACCAGCATTTACCACCAATGATGATATCCGTC 

CGTTTGATGACGATTCAAAAGGCATGCTGGTTGGTGAAGGTATTGGCATGATGGCGT 

TTAAACGTCTTGAAGATGCTGAACGTGACGGCGACAAAATTTATTCTGTACTGAAAG 

GTATCGGTACATCTTCAGATGGTCGTTTCAAATCTATTTACGCTCCACGCCCAGATG 

GCCAAGCAAAAGCGCTAAAACGTGCTTATGAAGATGCCGGTTTTGCCCCTGAAACAT 

GTGGTCTAATTGAAGGCCATGGTACGGGTACCAAAGCGGGTGATGCCGCAGAATTTG 

FIG. 6-1 




CTGGCTTGACCAAACACTTTGGCGCCGCCAGTGATGAAAAGCAATATATCGCCTTAG

GCTCAGTTAAATCGCAAATTGGTCATACTAAATCTGCGGCTGGCTCTGCGGGTATGA

TTAAGGCGGCATTAGCGCTGCATCATAAAATCTTACCTGCAACGATCCATATCGATA

AACCAAGTGAAGCCTTGGATATCAAAAACAGCCCGTTATACCTAAACAGCGAAACGC 

GTCCTTGGATGCCACGTGAAGATGGTATTCCACGTCCTGCAGGTATCAGCTCATTTG 

GTTTTGGCGGCACCAACTTCCATATTATTTTAGAAGAGTATCGCCCAGGTCACGATA

GCGCATATCGCTTAAACTCAGTGAGCCAAACTGTGTTGATCTCGGCAAACGACCAAC

AAGGTATTGTTGCTGAGTTAAATAACTGGCGTACTAAACTGGCTGTCGATGCTGATC

ATCAAGGGTTTGTATTTAATGAGTTAGTGACAACGTGGCCATTAAAAACCCCATCCG

TTAACCAAGCTCGTTTAGGTTTTGTTGCGCGTAATGCAAATGAAGCGATCGCGATGA

TTGATACGGCATTGAAACAATTCAATGCGAACGCAGATAAAATGACATGGTCAGTAC 

CTACCGGGGTTTACTATCGTCAAGCCGGTATTGATGCAACAGGTAAAGTGGTTGCGC

TATTCTCAGGGCAAGGTTCGCAATACGTGAACATGGGTCGTGAATTAACCTGTAACT

TCCCAAGCATGATGCACAGTGCTGCGGCGATGGATAAAGAGTTCAGTGCCGCTGGTT 

TAGGCCAGTTATCTGCAGTTACTTTCCCTATCCCTGTTTATACGGATGCCGAGCGTA 

AGCTACAAGAAGAGCAATTACGTTTAACGCAACATGCGCAACCAGCGATTGGTAGTT 

TGAGTGTTGGTCTGTTCAAAACGTTTAAGCAAGCAGGTTTTAAAGCTGATTTTGCTG

CCGGTCATAGTTTCGGTGAGTTAACCGCATTATGGGCTGCCGATGTATTGAGCGAAA 

GCGATTACATGATGTTAGCGCGTAGTCGTGGTCAAGCAATGGCTGCGCCAGAGCAAC

AAGATTTTGATGCAGGTAAGATGGCCGCTGTTGTTGGTGATCCAAAGCAAGTCGCTG

TGATCATTGATACCCTTGATGATGTCTCTATTGCTAACTTCAACTCGAATAACCAAG

TTGTTATTGCTGGTACTACGGAGCAGGTTGCTGTAGCGGTTACAACCTTAGGTAATG 

CTGGTTTCAAAGTTGTGCCACTGCCGGTATCTGCTGCGTTCCATACACCTTTAGTTC 

GTCACGCGCAAAAACCATTTGCTAAAGCGGTTGATAGCGCTAAATTTAAAGCGCCAA

GCATTCCAGTGTTTGCTAATGGCACAGGCTTGGTGCATTCAAGCAAACCGAATGACA 

TTAAGAAAAACCTGAAAAACCACATGCTGGAATCTGTTCATTTCAATCAAGAAATTG

FIG. 6-2 




ACAACATCTATGCTGATGGTGGCCGCGTATTTATCGAATTTGGTCCAAAGAATGTAT 

TAACTAAATTGGTTGAAAACATTCTCACTGAAAAATCTGATGTGACTGCTATCGCGG 

TTAATGCTAATCCTAAACAACCTGCGGACGTACAAATGCGCCAAGCTGCGCTGCAAA 

TGGCAGTGCTTGGTGTCGCATTAGACAATATTGACCCGTACGACGCCGTTAAGCGTC 

CACTTGTTGCGCCGAAAGCATCACCAATGTTGATGAAGTTATCTGCAGCGTCTTATG 

TTAGTCCGAAAACGAAGAAAGCGTTTGCTGATGCATTGACTGATGGCTGGACTGTTA 

AGCAAGCGAAAGCTGTACCTGCTGTTGTGTCACAACCACAAGTGATTGAAAAGATCG 

TTGAAGTTGAAAAGATAGTTGAACGCATTGTCGAAGTAGAGCGTATTGTCGAAGTAG 

AAAAAATCGTCTACGTTAATGCTGACGGTTCGCTTATATCGCAAAATAATCAAGACG 

TTAACAGCGCTGTTGTTAGCAACGTGACTAATAGCTCAGTGACTCATAGCAGTGATG 

CTGACCTTGTTGCCTCTATTGAACGCAGTGTTGGTCAATTTGTTGCACACCAACAGC 

AATTATTAAATGTACATGAACAGTTTATGCAAGGTCCACAAGACTACGCGAAAACAG 

TGCAGAACGTACTTGCTGCGCAGACGAGCAATGAATTACCGGAAAGTTTAGACCGTA 

CATTGTCTATGTATAACGAGTTCCAATCAGAAACGCTACGTGTACATGAAACGTACC 

TGAACAATCAGACGAGCAACATGAACACCATGCTTACTGGTGCTGAAGCTGATGTGC 

TAGCAACCCCAATAACTCAGGTAGTGAATACAGCCGTTGCCACTAGTCACAAGGTAG 

TTGCTCCAGTTATTGCTAATACAGTGACGAATGTTGTATCTAGTGTCAGTAATAACG 

CGGCGGTTGCAGTGCAAACTGTGGCATTAGCGCCTACGCAAGAAATCGCTCCAACAG 

TCGCTACTACGCCAGCACCCGCATTGGTTGCTATCGTGGCTGAACCTGTGATTGTTG 

CGCATGTTGCTACAGAAGTTGCACCAATTACACCATCAGTTACACCAGTTGTCGCAA 

CTCAAGCGGCTATCGATGTAGCAACTATTAACAAAGTAATGTTAGAAGTTGTTGCTG 

ATAAAACCGGTTATCCAACGGATATGCTGGAACTGAGCATGGACATGGAAGCTGACt 

TAGGTATCGACTCAATCAAACGTGTTGAGATATTAGGCGCAGTACAGGAATTGATCC 

CTGACTTACCTGAACTTAATCCTGAAGATCTTGCTGAGCTACGCACGCTTGGTGAGA 

TTGTCGATTACATGAATTCAAAAGCCCAGGCTGTAGCTCCTACAACAGTACCTGTAA 
CAAGTGCACCTGTTTCGCCTGCATCTGCTGGTATTGATTTAGCCCACATCCAAAACG

TAATGTTAGAAGTGGTTGCAGACAAAACCGGTTACCCAACAGACATGCTAGAACTGA 

GCATGGATATGGAAGCTGACTTAGGTATTGATTCAATCAAGCGTGTGGAAATCTTAG 

GTGCAGTACAGGAGATCATAACTGATTTACCTGAGCTAAACCCTGAAGATCTTGCTG 

AATTACGCACCCTAGGTGAAATCGTTAGTTACATGCAAAGCAAAGCGCCAGTCGCTG 

AAAGTGCGCCAGTGGCGACGGCTCCTGTAGCAACAAGCTCAGCACCGTCTATCGATT 

TGAACCACATTCAAACAGTGATGATGGATGTAGTTGCAGATAAGACTGGTTATCCAA 

CTGACATGCTAGAACTTGGCATGGACATGGAAGCTGATTTAGGTATCGATTCAATCA 

AACGTGTGGAAATATTAGGCGCAGTGCAGGAGATCATCACTGATTTACCTGAGCTAA 

ACCCAGAAGACCTCGCTGAATTACGCACGCTAGGTGAAATCGTTAGTTACATGCAAA

GCAAAGCGCCAGTCGCTGAGAGTGCGCCAGTAGCGACGGCTTCTGTAGCAACAAGCT 

CTGCACCGTCTATCGATTTAAACCATATCCAAACAGTGATGATGGAAGTGGTTGCAG 

ACAAAACCGGTTATCCAGTAGACATGTTAGAACTTGCTATGGACATGGAAGCTGACC 

TAGGTATCGATTCAATCAAGCGTGTAGAAATTTTAGGTGCGGTACAGGAAATCATTA 

CTGACTTACCTGAGCTTAACCCTGAAGATCTTGCTGAACTACGTACATTAGGTGAAA 

TCGTTAGTTACATGCAAAGCAAAGCGCCCGTAGCTGAAGCGCCTGCAGTACCTGTTG 

CAGTAGAAAGTGCACCTACTAGTGTAACAAGCTCAGCACCGTCTATCGATTTAGACC 

ACATCCAAAATGTAATGATGGATGTTGTTGCTGATAAGACTGGTTATCCTGCCAATA 

TGCTTGAATTAGCAATGGACATGGAAGCCGACCTTGGTATTGATTCAATCAAGCGTG 

TTGAAATTCTAGGCGCGGTACAGGAGATCATTACTGATTTACCTGAACTAAACCCAG 

AAGACTTAGCTGAACTACGTACGTTAGAAGAAATTGTAACCTACATGCAAAGCAAGG 

CGAGTGGTGTTACTGTAAATGTAGTGGCTAGCCCTGAAAATAATGCTGTATCAGATG 

CATTTATGCAAAGCAATGTGGCGACTATCACAGCGGCCGCAGAACATAAGGCGGAAT 

TTAAACCGGCGCCGAGCGCAACCGTTGCTATCTCTCGTCTAAGCTCTATCAGTAAAA 

TAAGCCAAGATTGTAAAGGTGCTAACGCCTTAATCGTAGCTGATGGCACTGATAATG 

CTGTGTTACTTGCAGACCACCTATTGCAAACTGGCTGGAATGTAACTGCATTGCAAC 

CAACTTGGGTAGCTGTAACAACGACGAAAGCATTTAATAAGTCAGTGAACCTGGTGA 
CTTTAAATGGCGTTGATGAAACTGAAATCAACAACATTATTACTGCTAACGCACAAT

TGGATGCAGTTATCTATCTGCACGCAAGTAGCGAAATTAATGCTATCGAATACCCAC 

AAGCATCTAAGCAAGGCCTGATGTTAGCCTTCTTATTAGCGAAATTGAGTAAAGTAA 

CTCAAGCCGCTAAAGTGCGTGGCGCCTTTATGATTGTTACTCAGCAGGGTGGTTCAT 

TAGGTTTTGATGATATCGATTCTGCTACAAGTCATGATGTGAAAACAGACCTAGTAC 

AAAGCGGCTTAAACGGTTTAGTTAAGACACTGTCTCACGAGTGGGATAACGTATTCT 

GTCGTGCGGTTGATATTGCTTCGTCATTAACGGCTGAACAAGTTGCAAGCCTTGTTA 

GTGATGAACTACTTGATGCTAACACTGTATTAACAGAAGTGGGTTATCAACAAGCTG 

GTAAAGGCCTTGAACGTATCACGTTAACTGGTGTGGCTACTGACAGCTATGCATTAA 

CAGCTGGCAATAACATCGATGCTAACTCGGTATTTTTAGTGAGTGGTGGCGCAAAAG 

GTGTAACTGCACATTGTGTTGCTCGTATAGCTAAAGAATATCAGTCTAAGTTCATCT 

TATTGGGACGTTCAACGTTCTCAAGTGACGAACCGAGCTGGGCAAGTGGTATTACTG 

ATGAAGCGGCGTTAAAGAAAGCAGCGATGCAGTCTTTGATTACAGCAGGTGATAAAC 

CAACACCCGTTAAGATCGTACAGCTAATCAAACCAATCCAAGCTAATCGTGAAATTG 

CGCAAACCTTGTCTGCAATTACCGCTGCTGGTGGCCAAGCTGAATATGTTTCTGCAG 

ATGTAACTAATGCAGCAAGCGTACAAATGGCAGTCGCTCCAGCTATCGCTAAGTTCG 

GTGCAATCACTGGCATCATTCATGGCGCGGGTGTGTTAGCTGACCAATTCATTGAGC 

AAAAAACACTGAGTGATTTTGAGTCTGTTTACAGCACTAAAATTGACGGTTTGTTAT 

CGCTACTATCAGTCACTGAAGCAAGCAACATCAAGCAATTGGTATTGTTCTCGTCAG 

CGGCTGGTTTCTACGGTAACCCCGGCCAGTCTGATTACTCGATTGCCAATGAGATCT 

TAAATAAAACCGCATACCGCTTTAAATCATTGCACCCACAAGCTCAAGTATTGAGCT 

TTAACTGGGGTCCTTGGGACGGTGGCATGGTAACGCCTGAGCTTAAACGTATGTTTG 

ACCAACGTGGTGTTTACATTATTCCACTTGATGCAGGTGCACAGTTATTGCTGAATG 

AACTAGCCGCTAATGATAACCGTTGTCCACAAATCCTCGTGGGTAATGACTTATCTA 

AAGATGCTAGCTCTGATCAAAAGTCTGATGAAAAGAGTACTGCTGTAAAAAAGCCAC 

AAGTTAGTCGTTTATCAGATGCTTTAGTAACTAAAAGTATCAAAGCGACTAACAGTA 
GCTCTTTATCAAACAAGACTAGTGCTTTATCAGACAGTAGTGCTTTTCAGGTTAACG
AAAACCACTTTTTAGCTGACCACATGATCAAAGGCAATCAGGTATTACCAACGGTAT 
GCGCGATTGCTTGGATGAGTGATGCAGCAAAAGCGACTTATAGTAACCGAGACTGTG 
CATTGAAGTATGTCGGTTTCGAAGACTATAAATTGTTTAAAGGTGTGGTTTTTGATG 
GCAATGAGGCGGCGGATTACCAAATCCAATTGTCGCCTGTGACAAGGGCGTCAGAAC 
AGGATTCTGAAGTCCGTATTGCCGCAAAGATCTTTAGCCTGAAAAGTGACGGTAAAC 

ctgtgtttcattatgcagcgacaatattgttagcaactcagccacttaatgctgtga 
aggtagaacttccgacattgacagaaagtgttgatagcaacaataaagtaactgatg 
aagcacaagcgttatacagcaatggcaccttgttccacggtgaaagtctgcagggca 
ttaagcagatattaagttgtgacgacaagggcctgcJtattggcttgtcagataaccg 
atgttgcaacagctaagcagggatccttcccgttagctgacaacaatatctttgcca 
atgatttggtttatcaggctatgttggtctgggtgcgcaaacaatttggtttaggta 
gcttaccttcggtgacaacggcttggactgtgtatcgtgaagtggttgtagatgaag 
tattttatctgcaacttaatgttgttgagcatgatctattgggttcacgcggcagta 
aagcccgttgtgatattcaattgattgctgctgatatgcaattacttgccgaagtga 
aatcagcgcaagtcagtgtcagtgacattttgaacgatatgtcatgatcgagtaaat 

AATAACGATAGGCGTCATGGTGAGCATGGCGTCTGCTTTCTTCATTTTTTAACATTA 

acaatattaatagctaaacgcggttgctttaaaccaagtaaacaagtgcttttagct 
attactattccaaacaggatattaaagagaatatgacggaattagctgttattggta 
tggatgctaaatttagcggacaagacaatattgaccgtgtggaacgcgctttctatg 
aaggtgcttatgtaggtaatgttagccgcgttagtaccgaatctaatgttattagca 
atggcgaagaacaagttattactgccatgacagttcttaactctgtcagtctactag 
cgcaaacgaatcagttaaatatagctgatatcgcggtgttgctgattgctgatgtaa 
aaagtgctgatgatcagcttgtagtccaaattgcatcagcaattgaaaaacagtgtg 
cgagttgtgttgttattgctgatttaggccaagcattaaatcaagtagctgatttag 
TTAATAACCAAGACTGTCCTGTGGCTGTAATTGGCATGAATAACTCGGTTAATTTAT
CTCGTCATGATCTTGAATCTGTAACTGCAACAATCAGCTTTGATGAAACCTTCAATG 
GTTATAACAATGTAGCTGGGTTCGCGAGTTTACTTATCGCTTCAACTGCGTTTGCCA 
ATGCTAAGCAATGTTATATATACGCCAACATTAAGGGCTTCGCTCAATCGGGCGTAA 
ATGCTCAATTTAACGTTGGAAACATTAGCGATACTGCAAAGACCGCATTGCAGCAAG 
CTAGCATAACTGCAGAGCAGGTTGGTTTGTTAGAAGTGTCAGCAGTCGCTGATTCGG 
CAATCGCATTGTCTGAAAGCCAAGGTTTAATGTCTGCTTATCATCATACGCAAACTT 
TGCATACTGCATTAAGCAGTGCCCGTAGTGTGACTGGTGAAGGCGGGTGTTTTTCAC 
AGGTCGCAGGTTTATTGAAATGTGTAATTGGTTTACATCAACGTTATATTCCGGCGA
TTAAAGATTGGCAACAACCGAGTGACAATCAAATGTCACGGTGGCGGAATTCACCAT 
TCTATATGCCTGTAGATGCTCGACCTTGGTTCCCAdATGCTGATGGCTCTGCACACA 
TTGCCGCTTATAGTTGTGTGACTGCTGACAGCTATTGTCATATTCTTTTACAAGAAA
ACGTCTTACAAGAACTTGTTTTGAAAGAAACAGTCTTGCAAGATAATGACTTAACTG 
AAAGCAAGCTTCAGACTCTTGAACAAAACAATCCAGTAGCTGATCTGCGCACTAATG 
GTTACTTTGCATCGAGCGAGTTAGCATTAATCATAGTACAAGGTAATGACGAAGCAC 
AATTACGCTGTGAATTAGAAACTATTACAGGGCAGTTAAGTACTACTGGCATAAGTA 
CTATCAGTATTAAACAGATCGCAGCAGACTGTTATGCCCGTAATGATACTAACAAAG 
CCTATAGCGCAGTGCTTATTGCCGAGACTGCTGAAGAGTTAAGCAAAGAAATAACCT 
TGGCGTTTGCTGGTATCGCTAGCGTGTTTAATGAAGATGCTAAAGAATGGAAAACCC 
CGAAGGGCAGTTATTTTACCGCGCAGCCTGCAAATAAACAGGCTGCTAACAGCACAC 
AGAATGGTGTCACCTTCATGTACCCAGGTATTGGTGCTACATATGTTGGTTTAGGGC
GTGATCTATTTCATCTATTCCCACAGATTTATCAGCCTGTAGCGGCTTTAGCCGATG 
ACATTGGCGAAAGTCTAAAAGATACTTTACTTAATCCACGCAGTATTAGTCGTCATA 
GCTTTAAAGAACTCAAGCAGTTGGATCTGGACCTGCGCGGTAACTTAGCCAATATCG 
CTGAAGCCGGTGTGGGTTTTGCTTGTGTGTTTACCAAGGTATTTGAAGAAGTCTTTG 
CCGTTAAAGCTGACTTTGCTACAGGTTATAGCATGGGTGAAGTAAGCATGTATGCAG 
CACTAGGCTGCTGGCAGCAACCGGGATTGATGAGTGCTCGCCTTGCACAATCGAATA 
CCTTTAATCATCAACTTTGCGGCGAGTTAAGAACACTACGTCAGCATTGGGGCATGG

ATGATGTAGCTAACGGTACGTTCGAGCAGATCTGGGAAACCTATACCATTAAGGCAA 

CGATTGAACAGGTCGAAATTGCCTCTGCAGATGAAGATCGTGTGTATTGCACCATTA 

TCAATACACCTGATAGCTTGTTGTTAGCCGGTTATCCAGAAGCCTGTCAGCGAGTCA 

TTAAGAATTTAGGTGTGCGTGCAATGGCATTGAATATGGCGAACGCAATTCACAGCG 

CGCCAGCTTATGCCGAATACGATCATATGGTTGAGCTATACCATATGGATGTTACTC 

CACGTATTAATACCAAGATGTATTCAAGCTCATGTTATTTACCGATTCCACAACGCA 

GCAAAGCGATTTCCCACAGTATTGCTAAATGTTTGTGTGATGTGGTGGATTTCCCAC 

GTTTGGTTAATACCTTACATGACAAAGGTGCGCGGGTATTCATTGAAATGGGTCCAG 

GTCGTTCGTTATGTAGCTGGGTAGATAAGATCTTAGTTAATGGCGATGGCGATAATA 

AAAAGCAAAGCCAACATGTATCTGTTCCTGTGAATGCCAAAGGCACCAGTGATGAAC 

TTACTTATATTCGTGCGATTGCTAAGTTAATTAGTCATGGCGTGAATTTGAATTTAG 

ATAGCTTGTTTAACGGGTCAATCCTGGTTAAAGCAGGCCATATAGCAAACACGAACA 

AATAGTCAACATCGATATCTAGCGCTGGTGAGTTATACCTCATTAGTTGAAATATGG 

ATTTAAAGAGAGTAATTATGGAAAATATTGCAGTAGTAGGTATTGCTAATTTGTTCC 

CGGGCTCACAAGCACCGGATCAATTTTGGCAGCAATTGCTTGAACAACAAGATTGCC 

GCAGTAAGGCGACCGCTGTTCAAATGGGCGTTGATCCTGCTAAATATACCGCCAACA 

AAGGTGACACAGATAAATTTTACTGTGTGCACGGCGGTTACATCAGTGATTTCAATT 

TTGATGCTTCAGGTTATCAACTCGATAATGATTATTTAGCCGGTTTAGATGACCTTA 

ATCAATGGGGGCTTTATGTTACGAAACAAGCCCTTACCGATGCGGGTTATTGGGGCA 

GTACTGCACTAGAAAACTGTGGTGTGATTTTAGGTAATTTGTCATTCCCAACTAAAT 

CATCTAATCAGCTGTTTATGCCTTTGTATCATCAAGTTGTTGATAATGCCTTAAAGG 

CGGTATTACATCCTGATTTTCAATTAACGCATTACACAGCACCGAAAAAAACACATG 

CTGACAATGCATTAGTAGCAGGTTATCCAGCTGCATTGATCGCGCAAGCGGCGGGTC 

TTGGTGGTTCACATTTTGCACTGGATGCGGCTTGTGCTTCATCTTGTTATAGCGTTA 

AGTTAGCGTGTGATTACCTGCATACGGGTAAAGCCAACATGATGCTTGCTGGTGCGG 

FIG. 6-8 




TATCTGCAGCAGATCCTATGTTCGTAAATATGGGTTTCTCGATATTCCAAGCTTACC 

CAGCTAACAATGTACATGCCCCGTTTGACCAAAATTCACAAGGTCTATTTGCCGGTG 

AAGGCGCGGGCATGATGGTATTGAAACGTCAAAGTGATGCAGTACGTGATGGTGATC 

ATATTTACGCCATTATTAAAGGCGGCGCATTATCGAATGACGGTAAAGGCGAGTTTG 

TATTAAGCCCGAACACCAAGGGCa\AGTATTAGTATATGAACGTGCTTATGCCGATG 

CAGATGTTGACCCGAGTACAGTTGACTATATTGAATGTCATGCAACGGGCACACCTA 

AGGGTGACAATGTTGAATTGCGTTCGATGGAAACCTTTTTCAGTCGCGTAAATAACA 

AACCATTACTGGGCTCGGTTAAATCTAACCTTGGTCATTTGTTAACTGCCGCTGGTA 

TGCCTGGCATGACCAAAGCTATGTTAGCGCTAGGTAAAGGTCTTATTCCTGCAACGA 

TTAACTTAAAGCAACCACTGCAATCTAAAAACGGTTACTTTACTGGCGAGCAAATGC 

CAACGACGACTGTGTCTTGGCCAACAACTCCGGGTGCCAAGGCAGATAAACCGCGTA 

CCGCAGGTGTGAGCGTATTTGGTTTTGGTGGCAGCAACGCCCATTTGGTATTACAAC 

AGCCAACGCAAACACTCGAGACTAATTTTAGTGTTGCTAAACCACGTGAGCCTTTGG 

CTATTATTGGTATGGACAGCCATTTTGGTAGTGCCAGTAATTTAGCGCAGTTCAAAA 

CCTTATTAAATAATAATCAAAATACCTTCCGTGAATTACCAGAACAACGCTGGAAAG 

GCATGGAAAGTAACGCTAACGTCATGCAGTCGTTACAATTACGCAAAGCGCCTAAAG 

GCAGTTACGTTGAACAGCTAGATATTGATTTCTTGCGTTTTAAAGTACCGCCTAATG 

AAAAAGATTGCTTGATCCCGCAACAGTTAATGATGATGCAAGTGGCAGACAATGCTG 

CGAAAGACGGAGGTCTAGTTGAAGGTCGTAATGTTGCGGTATTAGTAGCGATGGGCA 

TGGAACTGGAATTACATCAGTATCGTGGTCGCGTTAATCTAACCACCCAAATTGAAG 

ACAGCTTATTACAGCAAGGTATTAACCTGACTGTTGAGCAACGTGAAGAACTGACCA 

ATATTGCTAAAGACGGTGTTGCCTCGGCTGCACAGCTAAATCAGTATACGAGTTTCA 

TTGGTAATATTATGGCGTCACGTATTTCGGCGTTATGGGATTTTTCTGGTCCTGCTA 

TTACCGTATCGGCTGAAGAAAACTCTGTTTATCGTTGTGTTGAATTAGCTGAAAATC 

TATTTCAAACCAGTGATGTTGAAGCCGTTATTATTGCTGCTGTTGATTTGTCTGGTT 

CAATTGAAAACATTACTTTACGTCAGCACTACGGTCCAGTTAATGAAAAGGGATCTG 
TAAGTGAATGTGGTCCGGTTAATGAAAGCAGTTCAGTAACCAACAATATTCTTGATC
AGCAACAATGGCTGGTGGGTGAAGGCGCAGCGGCTATTGTCGTTAAACCGTCATCGC 
AAGTCACTGCTGAGCAAGTTTATGCGCGTATTGATGCGGTGAGTTTTGCCCCTGGTA 
GCAATGCGAAAGCAATTACGATTGCAGCGGATAAAGCATTAACACTTGCTGGTATCA 
GTGCTGCTGATGTAGCTAGTGTTGAAGCACATGCAAGTGGTTTTAGTGCCGAAAATA 
ATGCTGAAAAAACCGCGTTACCGACTTTATACCCAAGCGCAAGTATCAGTTCGGTGA 
AAGCCAATATTGGTCATACGTTTAATGCCTCGGGTATGGCGAGTATTATTAAAACGG 
CGCTGCTGTTAGATCAGAATACGAGTCAAGATCAGAAAAGCAAACATATTGCTATTA 
ACGGTCTAGGTCGTGATAACAGCTGCGCGCATCTTATCTTATCGAGTTCAGCGCAAG 
CGCATCAAGTTGCACCAGCGCCTGTATCTGGTATGGCCAAGCAACGCCCACAGTTAG 
TTAAAACCATCAAACTCGGTGGTCAGTTAATTAGCAACGCGATTGTTAACAGTGCGA 
GTTCATCTTTACACGCTATTAAAGCGCAGTTTGCCGGTAAGCACTTAAACAAAGTTA 
ACCAGCCAGTGATGATGGATAACCTGAAGCCCCAAGGTATTAGCGCTCATGCAACCA 
ATGAGTATGTGGTGACTGGAGCTGCTAACACTCAAGCTTCTAACATTCAAGCATCTC
ATGTTCAAGCGTCAAGTCATGCACAAGAGATAGCACCAAACCAAGTTCAAAATATGC 
AAGCTACAGCAGCCGCTGTAAGTTCACCCCTTTCTCAACATCAACACACAGCGCAGC 
CCGTAGCGGCACCGAGCGTTGTTGGAGTGACTGTGAAACATAAAGCAAGTAACCAAA 
TTCATCAGCAAGCGTCTACGCATAAAGCATTTTTAGAAAGTCGTTTAGCTGCACAGA 
AAAACCTATCGCAACTTGTTGAATTGCAAACCAAGCTGTCAATCCAAACTGGTAGTG 
ACAATACATCTAACAATACTGCGTCAACAAGCAATACAGTGCTAACAAATCCTGTAT 
CAGCAACGCCATTAACACTTGTGTCTAATGCGCCTGTAGTAGCGACAAACCTAACCA 
GTACAGAAGCAAAAGCGCAAGCAGCTGCTACACAAGCTGGTTTTCAGATAAAAGGAC 
CTGTTGGTTACAACTATCCACCGCTGCAGTTAATTGAACGTTATAATAAACCAGAAA 
ACGTGATTTACGATCAAGCTGATTTGGTTGAATTCGCTGAAGGTGATATTGGTAAGG 
TATTTGGTGCTGAATACAATATTATTGATGGCTATTCGCGTCGTGTACGTCTGCCAA 
CCTCAGATTACTTGTTAGTAACACGTGTTACTGAACTTGATGCCAAGGTGCATGAAT 
ACAAGAAATCATACATGTGTACTGAATATGATGTGCCTGTTGATGCACCGTTCTTAA

TTGATGGTCAGATCCCTTGGTCTGTTGCCGTCGAATCAGGCCAGTGTGATTTGATGT 

TGATTTCATATATCGGTATTGATTTCCAAGCGAAAGGCGAACGTGTTTACCGTTTAC 

TTGATTGTGAATTAACTTTCCTTGAAGAGATGGCTTTTGGTGGCGATACTTTACGTT 

ACGAGATCCACATTGATTCGTATGCACGTAACGGCGAGCAATTATTATTCTTCTTCC 

ATTACGATTGTTACGTAGGGGATAAGAAGGTACTTATCATGCGTAATGGTTGTGCTG 

GTTTCTTTACTGACGAAGAACTTTCTGATGGTAAAGGCGTTATTCATAACGACAAAG 

ACAAAGCTGAGTTTAGCAATGCTGTTAAATCATCATTCACGCCGTTATTACAACATA 

ACCGTGGTCAATACGATTATAACGACATGATGAAGTTGGTTAATGGTGATGTTGCCA 

GTTGTTTTGGTCCGCAATATGATCAAGGTGGCCGTAATCCATCATTGAAATTCTCGT 

CTGAGAAGTTCTTGATGATTGAACGTATTACCAAGATAGACCCAACCGGTGGTCATT 

GGGGACTAGGCCTGTTAGAAGGTCAGAAAGATTTAGACCCTGAGCATTGGTATTTCC 

CTTGTCACTTTAAAGGTGATCAAGTAATGGCTGGTTCGTTGATGTCGGAAGGTTGTG 

GGCAAATGGCGATGTTCTTCATGCTGTCTCTTGGTATGCATACCAATGTGAACAACG 

CTCGTTTCCAACCACTACCAGGTGAATCACAAACGGTACGTTGTCGTGGGCAAGTAC 

TGCCACAGCGCAATACCTTAACTTACCGTATGGAAGTTACTGCGATGGGTATGCATC 

CACAGCCATTCATGAAAGCTAATATTGATATTTTGCTTGACGGTAAAGTGGTTGTTG 

ATTTCAAAAACTTGAGCGTGATGATCAGCGAACAAGATGAGCATTCAGATTACCCTG 

TAACACTGCCGAGTAATGTGGCGCTTAAAGCGATTACTGCACCTGTTGCGTCAGTAG 

CACCAGCATCTTCACCCGCTAACAGCGCGGATCTAGACGAACGTGGTGTTGAACCQT 

TTAAGTTTCCTGAACGTCCGTTAATGCGTGTTGAGTCAGACTTGTCTGCACCGAAAA 

GCAAAGGTGTGACACCGATTAAGCATTTTGAAGCGCCTGCTGTTGCTGGTCATCATA 

GAGTGCCTAACCAAGCACCGTTTACACCTTGGCATATGTTTGAGTTTGCGACGGGTA 

ATATTTCTAACTGTTTCGGTCCTGATTTTGATGTTTATGAAGGTCGTATTCCACCTC 

GTACACCTTGTGGCGATTTACAAGTTGTTACTCAGGTTGTAGAAGTGCAGGGCGAAC 

GTCTTGATCTTAAAAATCCATCAAGCTGTGTAGCTGAATACTATGTACCGGAAGACG 
CTTGGTACTTTACTAAAAACAGCCATGAAAACTGGATGCCTTATTCATTAATCATGG
AAATTGCATTGCAACCAAATGGCTTTATTTCTGGTTACATGGGCACGACGCTTAAAT 
ACCCTGAAAAAGATCTGTTCTTCCGTAACCTTGATGGTAGCGGCACGTTATTAAAGC 
AGATTGATTTACGCGGCAAGACCATTGTGAATAAATCAGTCTTGGTTAGTACGGCTA 
TTGCTGGTGGCGCGATTATTCAAAGTTTCACGTTTGATATGTCTGTAGATGGCGAGC 
TATTTTATACTGGTAAAGCTGTATTTGGTTACTTTAGTGGTGAATCACTGACTAACC 
AACTGGGCATTGATAACGGTAAAACGACTAATGCGTGGTTTGTTGATAACAATACCC 
CCGCAGCGAATATTGATGTGTTTGATTTAACTAATCAGTCATTGGCTCTGTATAAAG 
CGCCTGTGGATAAACCGCATTATAAATTGGCTGGTGGTCAGATGAACTTTATCGATA 
CAGTGTCAGTGGTTGAAGGCGGTGGTAAAGCGGGCGTGGCTTATGTTTATGGCGAAC 
GTACGATTGATGCTGATGATTGGTTCTTCCGTTATCACTTCCACCAAGATCCGGTGA 
TGCCAGGTTCATTAGGTGTTGAAGCTATTATTGAGTTGATGCAGACCTATGCGCTTA 
AAAATGATTTGGGTGGCAAGTTTGCTAACCeACGTTTCATTGCGCCGATGACGCAAG 
TTGATTGGAAATACCGTGGGCAAATTACGCCGCTGAATAAACAGATGTCACTGGACG 
TGCATATCACTGAGATCGTGAATGACGCTGGTGAAGTGCGAATCGTTGGTGATGCGA 
ATCTGTCTAAAGATGGTCTGCGTATTTATGAAGTTAAAAACATCGTTTTAAGTATTG 
TTGAAGCGTAAAGGGTCAAGTGTAACGTGCTTAAGCGCCGCATTGGTTAAAGACGCT 
TTGCACGCCGTGAATCCGTCCATGGAGGCTTGGGGTTGGCATCCATGCCAACAACAG 
CAAGCTTACTTTAATCAATACGGCTTGGTGTCCATTTAGACGCCTCGAACTTAGTAG 
TTAATAGACAAAATAATTTAGCTGTGGAATGAATATAGTAAGTAATCATTCGGCAGC 
TACAAAAAAGGAATTAAGAATGTCGAGTTTAGGTTTTAACAATAACAACGCAATTAA 
CTGGGCTTGGAAAGTAGATCCAGCGTCAGTTCATACACAAGATGCAGAAATTAAAGC 
AGCTTTAATGGATCTAACTAAACCTCTCTATGTGGCGAATAATTCAGGCGTAACTGG 
TATAGCTAATCATACGTCAGTAGCAGGTGCGATCAGCAATAACATCGATGTTGATGT 
ATTGGCGTTTGCGCAAAAGTTAAACCCAGAAGATCTGGGTGATGATGCTTACAAGAA 
ACAGCACGGCGTTAAATATGCTTATCATGGCGGTGCGATGGCAAATGGTATTGCCTC 
GGTTGAATTGGTTGTTGCGTTAGGTAAAGCAGGGCTGTTATGTTCATTTGGTGCTGC

AGGTCTAGTGCCTGATGCGGTTGAAGATGCAATTCGTCGTATtCAAGCTGA^TTACC 

AAATGGCCCTTATGCGGTTAACTTGATCCATGCACCAGCAGAAGAAGCATTAGAGCG 

TGGCGCGGTTGAACGTTTCCTAAAACTTGGCGTCAAGACGGTAGAGGCTTCAGCTTA 

CCTTGGTTTAACTGAACACATTGTTTGGTATCGTGCTGCTGGTCTAACTAAAAACGC 

AGATGGCAGTGTTAATATCGGTAACAAGGTTATCGCTAAAGTATCGCGTACCGAAGT 

TGGTCGCCGCTTTATGGAACCTGCACCGCAAAAATTACTGGATAAGTTATTAGAACA 

AAATAAGATCACCCCTGAACAAGCTGCTTTAGCGTTGCTTGTACCTATGGCTGATGA 

TATTACTGGGGAAGCGGATTCTGGTGGTCATACAGATAACCGTCCGTTTTTAACATT 

ATTACCGACGATTATTGGTCTGCGTGATGAAGTGCAAGCGAAGTATAACTTCTCTCC 

TGCATTACGTGTTGGTGCTGGTGGTGGTATCGGAACGCCTGAAGCAGCACTCGCTGC 

ATTTAACATGGGCGCGGCTTATATCGTTCTGGGTTCTGTGAATCAGGCGTGTGTTGA 

AGCGGGTGGATCTGAATATACTCGTAAACTGTTATCGACAGTTGAAATGGCTGATGr 

GACTATGGCACCTGCTGCAGATATGTTTGAAATGGGTGTGAAGCTGCAAGTATTAAA 

ACGCGGTTCTATGTTCGCGATGCGTGCGAAGAAACTGTATGACTTGTATGTGGCTTA 

TGACTCGATTGAAGATATCCCAGCTGCTGAACGTGAGAAGATTGAAAAACAAATCTT 

CCGTGCAAACCTAGACGAGATTTGGGATGGCACTATCGCTTTCTTTACTGAACGCGA 

TCCAGAAATGCTAGCCCGTGCAACGAGTAGTCCTAAACGTAAAATGGCACTTATCTT 

CCGTTGGTATCTTGGCCTTTCTTCACGCTGGTCAAACACAGGCGAGAAGGGACGTGA 

AATGGATTATCAGATTTGGGCAGGCCCAAGTTTAGGTGCATTCAACAGCTGGGTGAA 

AGGTTCTTACCTTGAAGACTATACCCGCCGTGGCGCTGTAGATGTTGCTTTGCA7AT 

GCTTAAAGGTGCTGCGTATTTACAACGTGTAAACCAGTTGAAATTGCAAGGTGTTAG 

CTTAAGTACAGAATTGGCAAGTTATCGTACGAGTGATTAATGTTACTTGATGATATG 

TGAATTAATTAAAGCGCCTGAGGGCGCTTTTTTTGGTTTTTAACTCAGGTGTTGTAA 
CTCGAAATTGCCCCTTTC 

19227 
CGCTGCCGCCGCGTCTCGCCGCGCCGCGCCGCGCCGCCGCCGCCGCTCGCGCGCACGCC

CGCGCGTCTCGCCGCGCCTGCTGTCTCGAACGAGCTTCTCGAGAAGGCCGAGACCGTCG 

TCATGGAGGTCCTCGCCGCCAAGACTGGCTACGAGACTGACATGATCGAGTCCGACATG 

GAGCTCGAGACTGAGCTCGGCATTGACTCCATCAAGCGTGTCGAGATCCTCTCCGAGGT 

TCAGGCCATGCTCAACGTCGAGGCCAAGGACGTCGACGCTCTCAGCCGCACTCGCACTG 

TGGGTGAGGTCGTCAACGCCATGAAGGCTGAGATCGCTGGTGGCTCTGCCCCGGCGCCT 

GCCGCCGCTGCCCCAGGTCCGGCTGCTGCCGCCCCTGCGCCTGCTGTCTCGAGCGAGCT 

TCTCGAGAAGGCCGAGACTGTCGTCATGGAGGTCCTCGCCGCCAAGACTGGCTACGAGA 

CTGACATGATTGAGTCCGACATGGAGCTCGAGACCGAGCTCGGCATTGACTCCATCAAG 

CGTGTCGAGATTCTCTCCGAGGTTCAGGCCATGCTCAACGTCGAGGCCAAGGACGTCGA 

CGCTCTCAGCCGCACTCGCACTGTTGGTGAGGTCGTCGATGCCATGAAGGCTGAGATCG 

CTGGCAGCTCCGCCTCGGCGCCTGCCGCCGCTGCTCCTGCTCCGGCTGCTGCCGCTCCT 

GCGCCCGCTGCCGCCGCCCCTGCTGTCTCGAACGAGCTTCTCGAGAAAGCCGAGACTGT 

CGTCATGGAGGTCCTCGCCGCCAAGACTGGCTACGAGACTGACATGATCGAGTCCGACA 

TGGAGCTCGAGACTGAGCTCGGCATTGACTCCATCAAGCGTGTCGAGATCCTCTCCGAG 

GTTCAGGCCATGCTCAACGTCGAGGCCAAGGACGTCGATGCCCTCAGCCGCACCCGCAC 

TGTTGGCGAGGTTGTCGATGCCATGAAGGCCGAGATCGCTGGTGGCTCTGCCCCGGCGC 

CTGCCGCCGCTGCCCCTGCTCCGGCTGCCGCCGCCCCTGCTGTCTCGAACGAGCTTCTT 

GAGAAGGCCGAGACTGTCGTCATGGAGGTCCTCGCCGCCAAGACTGGCTACGAGACCGA 

CATGATCGAGTCCGACATGGAGCTCGAGACCGAGCTCGGCATTGACTCCATCAAGCGTG 

TCGAGATTCTCTCCGAGGTTCAGGCCATGCTCAACGTCGAGGCCAAGGACGTCGATGCT 

CTCAGCCGCACTCGCACTGTTGGCGAGGTCGTCGATGCCATGAAGGCTGAGATCGCCGG 

CAGCTCCGCCCCGGCGCCTGCCGCCGCTGCTCCTGCTCCGGCTGCTGCCGCTCCTGCGC 

CCGCTGCCGCTGCCCCTGCTGTCTCGAGCGAGCTTCTCGAGAAGGCCGAGACCGTCGTC 

ATGGAGGTCCTCGCCGCCAAGACTGGCTACGAGACTGACATGATTGAGTCCGACATGGA 

GCTCGAGACTGAGCTCGGCATTGACTCCATCAAGCGTGTCGAGATCCTCTCCGAGGTTC 

AGGCCATGCTCAACGTCGAGGCCAAGGACGTCGATGCCCTCAGCCGCACCCGCACTGTT 

GGCGAGGTTGTCGATGCCATGAAGGCCGAGATCGCTGGTGGCTCTGCCCCGGCGCCTGC 

CGCCGCTGCCCCTGCTCCGGCTGCCGCCGCCCCTGCTGTCTCGAACGAGCTTCTTGAGA 

AGGCCGAGACCGTCGTCATGGAGGTCCTCGCCGCCAAGACTGGCTACGAGACCGACATG 

ATCGAGTCCGACATGGAGCTCGAGACCGAGCTCGGCATTGACTCCATCAAGCGTGTCGA 

GATTCTCTCCGAGGTTCAGGCCATGCTCAACGTCGAGGCCAAGGACGTCGACGCTCTCA 

GCCGCACTCGCACTGTTGGCGAGGTCGTCGATGCCATGAAGGCTGAGATCGCTGGTGGC 

TCTGCCCCGGCGCCTGCCGCCGCTGCTCCTGCCTCGGCTGGCGCCGCGCCTGCGGTCAA 

GATTGACTCGGTCCACGGCGCTGACTGTGATGATCTTTCCCTGATGCACGCCAAGGTGG 

TTGACATCCGCCGCCCGGACGAGCTCATCCTGGAGCGCCCCGAGAACCGCCCCGTTCTC 

GTTGTCGATGACGGCAGCGAGCTCACCCTCGCCCTGGTCCGCGTCCTCGGCGCCTGCGC 

CGTTGTCCTGACCTTTGAGGGTCTCCAGCTCGCTCAGCGCGCTGGTGCCGCTGCCATCC 

GCCACGTGCTCGCCAAGGATCTTTCCGCGGAGAGCGCCGAGAAGGCCATCAAGGAGGCC 

GAGCAGCGCTTTGGCGCTCTCGGCGGCTTCATCTCGCAGCAGGCGGAGCGCTTCGAGCC 

CGCCGAAATCCTCGGCTTCACGCTCATGTGCGCCAAGTTCGCCAAGGCTTCCCTCTGCA 

CGGCTGTGGCTGGCGGCCGCCCGGCCTTTATCGGTGTGGCGCGCCTTGACGGCCGCCTC 
GGATTCACTTCGCAGGGCACTTCTGACGCGCTCAAGCGTGCCCAGCGTGGTGCCATCTT

?gStctSaagaccatcggcctcgagtggtccgagtctgacgtcttttcccgcggc^ 
tcgaSttgctcagggcatgcaccccgaggatgccgccgtggcgattgtgcgcgagat^ 
??gtgStgacattcgcattcgcgaggtcggcattggcgcaaaccagcagcgctgcac 

ga?Sg?Scgccaagctcgagaccggcaacccgca^ 

?SgctStttctggcggcgctcgcggcatcacgcctctttgcatccgggagatcacg 
SSaStcgcgggcggcaagtacattctgcttggccgcagcaaggtctctgcgag^^^ 
a?cSgg?gcgctggcatcactgacgagaaggctgtgcaaaaggctgc^^^^ 
aScSgcgcgcctttagcgctggcgagggccccaagcccacgccccgcgctgtcact 

^gc^gggctctgttcttggcgctcgcgaggtgcgcagctcta^^ 

tSS?CGG?GSLGGCCATCTACTCGTCGTGC^^^^ 

??SSc?gtgcgcgatgccgagtcccagctcggtgcccgcgtctcgggcatcgttcat 
gc^cgS?g?gctccgcgaccgtctcatc^^^ 

rOTCT^TcSACCAAGGTCACCGGTCTCGAGAACCTCCTCGCCGCCGT^^^^ 

SScIIg?SSgg?cctcttcagctcgctcgccgg^ 
?Sga™gccatggccaacgaggcccttaacaagatgggcctcgagctcgccaagga 

S??tSgtcaagtcgatctgcttcggtccctgggacggtggcatggtgacgccgcagc 

SI^gSgSgttccaggagatgggcgtgcagatcatcccccgcgagggcggcgctgat 

I™?SScGCGLTCGTGCrCGGCTCCTCGCCGGCTGAGATCCTTG^^ 

cSScgtccaagaaggtcggctcggacaccatcaccctgcaccgcaagatttc^^^ 

AGTCcScCCCTTCCTCGAGGACCACGTCATCCAGGGCCGCCGCGTGCTC^^ 
™S^TTGGCTCGCrCGCGGAGACCTGCCTCGGCCTCTTCCCCGGCT^^ 

Sc^StScgacgcccagctcttcaagggtgtcactgtcgacggcgacgtcaactg^^ 

aggSaccctcaccccgtcgacggcgccctcgggccgcgtcaacgtccagg^ 

^SotSccagcggcaagctggtcccggcctaccgcgccgtcatcgtgctct 

^aSgSgcgcccccggccaacgccaccatgcagccgccctcgctcgatgccgatc^^^ 

Sctccagggctccgtctacgacggcaagaccctcttcc^cggcccggcctt^ 

SSatga?gSctctcgtgcaccaagagccagcttgtggc^^^ 

?GGCTC?GACG?ScTCG^^ 

S^JgaStggcctttcaggccatgctcgtctggg^^ 

G?S^TCCCCAACTCGATCCAGCGCATCGTCCAGCACCGCCCGGTCCCGCAGGACAAGCC 

SSaS??aSctccgctccaaccagtcgggcggtcactcccagcacaagcac^^ 
ScaStccacaacgagcagggcgatctcttcattgatgtccaggcttcggtcatcgcc 

acggacagccttgccttctaa 

TGCCGTCTTTGAGGAGCATGACCCCTCCAACGCCGCCTGCACGGGCCACGACTCCATTT

CTGCGCTCTCGGCCCGCTGCGGCGGTGAAAGCAACATGCGCATCGCCATCACTGGTATG 

GACGCCACCTTTGGCGCTCTCAAGGGACTCGACGCCTTCGAGCGCGCCATTTACACCGG 

CGCTCACGGTGCCATCCCACTCCCAGAAAAGCGCTGGCGCTTTCTCGGCAAGGACAAGG 

ACTTTCTTGACCTCTGCGGCGTCAAGGCCACCCCGCACGGCTGCTACATTGAAGATGTT 

GAGGTCGACTTCCAGCGCCTCCGCACGCCCATGACCCCTGAAGACATGCTCCTCCCTCA 

GCAGCTTCTGGCCGTCACCACCATTGACCGCGCCATCCTCGACTCGGGAATGAAAAAGG 

GTGGCAATGTCGCCGTCTTTGTCGGCCTCGGCACCGACCTCGAGCTCTACCGTCACCGT 

GCTCGCGTCGCTCTCAAGGAGCGCGTCCGCCCTGAAGCCTCCAAGAAGCTCAATGACAT 

GATGCAGTACATTAACGACTGCGGCACATCCACATCGTACACCTCGTACATTGGGAACC 

TCGTCGCCACGCGCGTCTCGTCGCAGTGGGGCTTCACGGGCCCCTCCTTTACGATCACC 

GAGGGCAACAACTCCGTCTACCGCTGCGCCGAGCTCGGCAAGTACCTCCTCGAGACCGG 

CGAGGTCGATGGCGTCGTCGTTGCGGGTGTCGATCTCTGCGGCAGTGCCGAAAACCTTT 

ACGTCAAGTCTCGCCGCTTCAAGGTGTCCACCTCCGATACCCCGCGCGCCAGCTTTGAC 

GCCGCCGCCGATGGCTACTTTGTCGGCGAGGGCTGCGGTGCCTTTGTGCTCAAGCGTGA 

GACTAGCTGCACCAAGGACGACCGTATCTACGCTTGCATGGATGCCATCGTCCCTGGCA 

ACGTCCCTAGCGCCTGCTTGCGCGAGGCCCTCGACCAGGCGCGCGTCAAGCCGGGCGAT 

ATCGAGATGCTCGAGCTCAGCGCCGACTCCGCCCGCCACCTCAAGGACCCGTCCGTCCT 

GCCCAAGGAGCTCACTGCCGAGGAGGAAATCGGCGGCCTTCAGACGATCCTTCGTGACG 

ATGACAAGCTCCCGCGCAACGTCGCAACGGGCAGTGTCAAGGCCACCGTCGGTGACACC 

GGTTATGCCTCTGGTGCTGCCAGCCTCATCAAGGCTGCGCTTTGCATCTACAACCGCTA 

CCTGCCCAGCAACGGCGACGACTGGGATGAACCCGCCCCTGAGGCGCCCTGGGACAGCA

CCCTCTTTGCGTGCCAGACCTCGCGCGCTTGGCTCAAGAACCCTGGCGAGCGTCGCTAT 

GCGGCCGTCTCGGGCGTCTCCGAGACGCGCTCGTGCTATTCCGTGCTCCTCTCCGAAGC 

CGAGGGCCACTACGAGCGCGAGAACCGCATCTCGCTCGACGAGGAGGCGCCCAAGCTCA 

TTGTGCTTCGCGCCGACTCCCACGAGGAGATCCTTGGTCGCCTCGACAAGATCCGCGAG 

CGCTTCTTGCAGCCCACGGGCGCCGCCCCGCGCGAGTCCGAGCTCAAGGCGCAGGCCCG 

CCGCATCTTCCTCGAGCTCCTCGGCGAGACCCTTGCCCAGGATGCCGCTTCTTCAGGCT 

CGCAAAAGCCCCTCGCTCTCAGCCTCGTCTCCACGCCCTCCAAGCTCCAGCGCGAGGTC 

GAGCTCGCGGCCAAGGGTATCCCGCGCTGCCTCAAGATGCGCCGCGATTGGAGCTCCCC 

TGCTGGCAGCCGCTACGCGCCTGAGCCGCTCGCCAGCGACCGCGTCGCCTTCATGTACG 

GCGAAGGTCGCAGCCCTTACTACGGCATCACCCAAGACATTCACCGCATTTGGCCCGAA 

CTCCACGAGGTCATCAACGAAAAGACGAACCGTCTCTGGGCCGAAGGCGACCGCTGGGT 

CATGCCGCGCGCCAGCTTCAAGTCGGAGCTCGAGAGCCAGCAGCAAGAGTTTGATCGCA 

ACATGATTGAAATGTTCCGTCTTGGAATCCTCACCTCAATTGCCTTCACCAATCTGGCG 

CGCGACGTTCTCAACATCACGCCCAAGGCCGCCTTTGGCCTCAGTCTTGGCGAGATTTC 

CATGATTTTTGCCTTTTCCAAGAAGAACGGTCTCATCTCCGACCAGCTCACCAAGGATC 

TTCGCGAGTCCGACGTGTGGAACAAGGCTCTGGCCGTTGAATTTAATGCGCTGCGCGAG 

GCCTGGGGCATTCCACAGAGTGTCCCCAAGGACGAGTTCTGGCAAGGCTACATTGTGCG 

CGGCACCAAGCAGGATATCGAGGCGGCCATCGCCCCGGACAGCAAGTACGTGCGCCTCA 

CCATCATCAATGATGCCAACACCGCCCTCATTAGCGGCAAGCCCGACGCCTGCAAGGCT 

GCGATCGCGCGTCTCGGTGGCAACATTCCTGCGCTTCCCGTGACCCAGGGCATGTGCGG 

CCACTGCCCCGAGGTGGGACCTTATACCAAGGATATCGCCAAGATCCATGCCAACCTTG 
AGTTCCCCGTTGTCGACGGCCTTGACCTCTGGACCACAATCAACCAGAAGCGCCTCGTG
CCACGCGCCACGGGCGCCAAGGACGAATGGGCCCCTTCTTCCTTTGGCGAGTACGCCGG 
CCAGCTCTACGAGAAGCAGGCTAACTTCCCCCAAATCGTCGAGACCATTTACAAGCAAA 
ACTACGACGTCTTTGTCGAGGTTGGGCCCAACAACCACCGTAGCACCGCAGTGCGCACC 
ACGCTTGGTCCCCAGCGCAACCACCTTGCTGGCGCCATCGACAAGCAGAACX3AGGATGC 
TTGGACGACCATCGTCAAGCTTGTGGCTTCGCTCAAGGCCCACCTTGTTCCTGGCGTCA 
CGATCTCGCCGCTGTACCACTCCAAGCTTGTGGCGGAGGCTCAGGCTTGCTACGCTGCG 
CTCTGCAAGGGTGAAAAGCCCAAGAAGAACAAGTTTGTGCGCAAGATTCAGCTCAACGG 
TCGCTTCAACAGCAAGGCGGACCCCATCTCCTCGGCCGATCTTGCCAGCTTTCCGCCTG 
CGGACCCTGCCATTGAAGCCGCCATCTCGAGCCGCATCATGAAGCCTGTCGCTCCCAAG 
TTCTACGCGCGTCTCAACATTGACGAGCAGGACGAGACCCGAGATCCGATCCTCAACAA 
GGACAACGCGCCGTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTC
CGTCGCCTGCTCCTTCGGCCCCCGTGCAAAAGAAGGCTGCTCCCGCCGCGGAGACCAAG 
GCTGTTGCTTCGGCTGACGCACTTCGCAGTGCCCTGCTCGATCTCGACAGTATGCTTGC 
GCTGAGCTCTGCCAGTGCCTCCGGCAACCTTGTTGAGACTGCGCCTAGCGACGCCTCGG 
TCATTGTGCCGCCCTGCAACATTGCGGATCTCGGCAGCCGCGCCTTCATGAAAACGTAC 
GGTGTTTCGGCGCCTCTGTACACGGGCGCCATGGCCAAGGGCATTGCCTCTGCGGACCT 
CGTCATTGCCGCCGGCCGCCAGGGCATCCTTGCGTCCTTTGGCGCCGGCGGACTTCCCA 
TGCAGGTTGTGCGTGAGTCCATCGAAAAGATTCAGGCCGCCCTGCCCAATGGCCCGTAC 
GCTGTCAACCTTATCCATTCTCCCTTTGACAGCAACCTCGAAAAGGGCAATGTCGATCT 
CTTCCTCGAGAAGGGTGTCACCTTTGTCGAGGCCTCGGCCTTTATGACGCTCACCCCGC 
AGGTCGTGCGGTACCGCGCGGCTGGCCTCACGCGCAACGCCGACGGCTCGGTCAACATC 
CGCAACCGTATCATTGGCAAGGTCTCGCGCACCGAGCTCGCCGAGATGTTCATGCGTCC 
TGCGQCCGAGCACCTTCTTCAGAAGCTCATTGCTTCCGGCGAGATCAACCAGGAGCAGG 
CCGAGCTCGCCCGCCGTGTTCCCGTCGCTGACGACATCGCGGTCGAAGCTGACTCGGGT 
GGCCACACCGACAACCGCCCCATCCACGTCATTCTGCCCCTCATCATCAACCTTCGCGA 
CCGCCTTCACCGCGAGTGCGGCTACCCGGCCAACCTTCGCGTCCGTGTGGGCGCCGGCG 
GTGGCATTGGGTGCCCCCAGGCGGCGCTGGCCACCTTCAACATGGGTGCCTCCTTTATT 
GTCACCGGCACCGTGAACCAGGTCGCCAAGCAGTCGGGCACGTGCGACAATGTGCGCAA 
GCAGCTCGCGAAGGCCACTTACTCGGACGTATGCATGGCCCCGGCTGCCGACATGTTCG 
AGGAAGGCGTCAAGCTTCAGGTCCTCAAGAAGGGAACCATGTTTCCCTCGCGCGCCAAC 
AAGCTCTACGAGCTCTTTTGCAAGTACGACTCGTTCGAGTCCATGCCCCCCGCAGAGCT 
TGCGCGCGTCGAGAAGCGCATCTTCAGCCGCGCGCTCGAAGAGGTCTGGGACGAGACCA 
AAAACTTTTACATTAACCGTCTTCACAACCCGGAGAAGATCCAGCGCGCCGAGCGCGAC 
CCCAAGCTCAAGATGTCGCTGTGCTTTCGCTGGTACCTGAGCCTGGCGAGCCGCTGGGC 
CAACACTGGAGCTTCCGATCGCGTCATGGACTACCAGGTCTGGTGCGGTCCTGCCATTG 
GTTCCTTCAACGATTTCATCAAGGGAACTTACCTTGATCCGGCCGTCGCAAACGAGTAC 
CCGTGCGTCGTTCAGATTAACAAGCAGATCCTTCGTGGAGCGTGCTTCTTGCGCCGTCT 
CGAAATTCTGCGCAACGCACGCCTTTCCGATGGCGCTGCCGCTCTTGTGGCCAGCATCG 
ATGACACATACGTCCCGGCCGAGAAGCTGTAAGTAAGCTCTCATATATGTTAGTTGCGT 
GAGACCGACACGAAGATAATATCACATACGCTTTTGTTTGTTCTTTCAATTATTTGTCT 
GTGCTTCATGTTGCTCCTCAGTATCTAGCTGGCGGCTCTTATCTTCTTTTAAAATATCT 
GGACAAGGACAAAAACAAGAATAAAGGCGAGAAGATGTGAATTTCATTTCGACTTGAGA 
ACTCGAAGAGCATTGATGCGGTTAGTATATGGGTATTTTCCAGACACTTTTCATCATCA

TCATCATCATCATCATTATGAAGAAGTAGTAGCTGATAAAGT^^^ 

CGAGAAAAAAAAAAAAAAAAAAA 

CGAGCAGAGGCCGGCCGCGAGCCCGAGCCCGCGCCGCAGATCACTAGTACCGCTGCGGA

a?gaSgcagcagcagcagcagcagcagcagcagcagcagcagcagc 

GAGATAAAGAAAAAGCGGCAGAGACGATGGCGCTCCGTGTCAAGACGAACAAGAAGCCA 

tcctgggagatgaccaaggaggagctgaccagcggcaagaccgaggtgttcaactatga 

rrAACTCCTCGAGTTCGCAGAGGGCGACATCGCCAAGGTCTTCGGACCCGAGTTCGCCG 
TCATCGACAAGTACCCGCGCCGCGTGCGCCTGCCCGCCCGCGAGTACCTGCTCGTGACC 
rrrGTCACCCTCATGGACGCCGAGGTCAACAACTACCGCGTCGGCGCCCGCATGGTCAC 
rrAGTACGATCTCCCCGTCAACGGAGAGCTCTCCGAGGGCGGAGACTGCCCCTGGGCCG 
TCCTGGTCGAGAGTGGCCAGTGCGATCTCATGCTCATCTCCTACATGGGCATTGACTTC 
CAGAACCAGGGCGACCGCGTCTACCGCCTGCTCAACACCACGCTCACCTTTTACGGCGT 
GGCCCACGAGGGCGAGACCCTCGAGTACGACATTCGCGTCACCGGCTTCGCCAAGCGTC 
TCGACGGCGGCATCTCCATGTTCTTCTTCGAGTACGACTGCTACGTCAACGGCCGCCTC 
CTCATCGAGATGCGCGATGGCTGCGCCGGCTTCTTCACCAACGAGGAGCTCGACGCCGG 

caSggcgtcgtcttcacccgcggcgacctcgccgcccgcgccaagatcccaaagcagg 

ACGTCTCCCCCTACGCCGTCGCCCCCTGCCTCCACAAGACCAAGCTCAACGAAAAGGAG 

atgcagaccctcgtcgacaaggactgggcatccgtctttggctccaagaacggcatgcc 

rGAAATCAACTACAAACTCTGCGCGCGTAAGATGCTCATGATTGACCGCGTCACCAGCA 
TTGACCACAAGGGCGGTGTCTACGGCCTCGGTCAGCTCGTCGGTGAAAAGATCCTCGAG 
CGCGACCACTGGTACTTTCCCTGCCACTTTGTCAAGGATCAGGTCATGGCCGGATCCCT 

cgtctccgacggctgcagccagatgctcaagatgtacatgatctggctcggcctccacc 

TCACCACCGGACCCTTTGACTTCCGCCCGGTCAACGGCCACCCCAACAAGGTCCGCTGC 

rGCGGCCAAATCTCCCCGCACAAGGGCAAGCTCGTCTACGTqATGGAGATCAAGGAGAT 

GGGCTTCGACGAGGACAACGACCCGTACGCCATTGCCGACGTCAACATCATTGATGTCG 

ACTTCGAAAAGGGCCAGGACTTTAGCCTCGACCGCATCAGCGACTACGGCAAGGGCGAC 

CTCAACAAGAAGATCGTCGTCGACTTTAAGGGCATCGCTCTCAAGATGCAGAAGCGCTC 

CACCAACAAGAACCCCTCCAAGGTTCAGCCCGTCTTTGCCAACGGCGCCGCCACTGTCG 

GCCCCGAGGCCTCCAAGGCTTCCTCCGGCGCCAGCGCCAGCGCCAGCGCCGCCCCGGCC 

AAGCCTGCCTTCAGCGCCGATGTTCTTGCGCCCAAGCCCGTTGCCCTTCCCGAGCACAT 

CCTCAAGGGCGACGCCCTCGCCCCCAAGGAGATGTCCTGGCACCCCATGGCCCGCATCC 

CGGGCAACCCGACGCCCTCTTTTGCGCCCTCGGCCTACAAGCCGCGCAACATCGCCTTT 

ACGCCCTTCCCCGGCAACCCCAACGATAACGACCACACCCCGGGCAAGATGCCGCTCAC 

CTGGTTGAACATGGCCGAGTTCATGGCCGGCAAGGTCAGCATGTGCCTCGGCCCCGAGT 

TCGCCAAGTTCGACGACTCGAACACCAGCCGCAGCCCCGCTTGGGACCTCGCTCTCGTC 

ACCCGCGCCGTGTCTGTGTCTGACCTCAAGCACGTCAACTACCGCAACATCGACCTCGA 

CCCCTCCAAGGGTfir-raTnGTCGGCGAGTTCG ACTGCCCCGCGGACG CCTGGTTCT^ 

"agggcgcctgcaacgatgcccacatgccgtactcgatcctcatggagatcgccctccag 

ACCTCGGGTGTGCTCACCTCGGTGCTCAAGGCGCCCCTGACCATGGAGAAGGACGACAT 

cctcttccgcaacctcgacgccaacgccgagttcgtgcgcgccgacctcgactaccgcg 

GCAAGACTATCCGCAACGTCACCAAGTGCACTGGCTACAGCATGCTCGGCGAGATGGGC 

gtccaccgcttcacctttgagctctacgtcgatgatgtgctcttttacaagggctcgac 

CTCGTTCGGCTGGTTCGTGCCCGAGGTYTTTGCCGCCCAGGCCGGCCTCGACAACGGCC 

gcaagtcggagccctggttcattgagaacaaggttccggcctcgcaggtctcctccttt 

GACGTGCGCCCCAACGGCAGCGGCCGCACCGCCATCTTCGCCAACGCCCCCAGCGGCGC 
rcAGCTCAACCGCCGCACGGACCAGGGCCAGTACCTCGACGCCGTCGftCATTGTCTCCG

gSgSgcaagaagagcctcggctacgcccacggttccaagacggtcaacccgaac^^^ 

?S?TCTTCTCGTGCCACTTTTGGTTTGACTCGGTCATGCCCGGAAGTCTCGGTGTCGA 

ctccatgttccagctcgtcgaggccatcgccgcccacgaggatctcgctggcaaagcac 
ggSttgccaaccccacctttgtgcacgcccccgggcaagatcaagctggaagtaccgc 
ggsSgctcacgcccaagagcaagaagatggactcggaggtccacatcgtgtccgtgg^ 
?gcSaSacggcgttgtcgacctcgtcgccgacggcttcctctgggccgacagcctcc 

r?GTCTACTCGGTGAGCAACATTCGCGTGCGCATCGCCTCCGGTGAGGCCCCTGC^^ 

rrCTCCTCCGCCGCCTCTGTGGGCTCCTCGGCTTCGTCCGTCGAGCGCACGCGCTCGAG 

rrrCGCTGTCGCCTCCGGCCCGGCCCAGAGCATCGACCTCAAGCAGCTCAAGACCGAGC 

TCCTCGAGCTCGATGCCCCGCTCTACCTCTCGCAGGACCCGACCAGCGGCCAGCTCAAG 

AAGCACACCGACGTGGCCTCCGGCCAGGCCACCATCGTGCAGCCCTGCACGCTCGGCGA 

rrTCGGTGACCGCTCCTTCATGGAGACCTACGGCGTCGTCGCCCCGCTGTACACGGGCG 

CCATGGCCAAGGGCATTGCCTCGGCGGACCTCGTCATCGCCGCCGGCAAGCGCAAGATC 

rTCGGCTCCTTTGGCGCCGGCGGCCTCCCCATGCACCACGTGCGCGCCGCCCTCGAGAA 

GATCCAGGCCGCCCTGCCTCAGGGCCCCTACGCCGTCAACCTCATGCACTCGCCTTTTG 

ACAGCAACCTCGAGAAGGGCAACGTCGATCTCTTCCTCGAGAAGGGCGTCACTGTGGTG 

GAGGCCTCGGCATTCATGACCCTCACCCCGCAGGTCGTGCGCTACCGCGCCGCCGGCCT 

CTCGCGCAACGCCGACGGTTCGGTCAACATCCGCAACCGCATCATCGGCAAGGTCTCGC 

GCACCGAGCTCGCCGAGATGTTCATCCGCCCGGCCCCGGAGCACCTCCTCGAGAAGCTC 

S?GcScGGGCGAGATCACCCAGGAGCAGGCCGAGCTCGCGCGCCGCGTTCCCGTCGC 

CGACGATATCGCTGTCGAGGCTGACTCGGGCGGCCACACCGACAACCGCCCCATCCACG 

?CAT?CTCCCGCTCATCATCAACCTCCGCAACCGCCTGCACCGCGAGTGCGGCTACCCC 

GCGCACCTCCGCGTCCGCGTTGGCGCCGGCGGTGGCGTCGGCTGCCCGCAGGCCGCCGC 

?GCCGCGCTCACCATGGGCGCCGCCTTCATCGTCACCGGCACTGTCAACCAGGTCGCCA 

agS^??Sgcacctgcgacaacgtgcgcaagcagctctcgcaggccacct^^^^ 

ATCTGCATGGCCCCGGCCGCCGACATGTTCGAGGAGGGCGTCAAGCTCCAGGTCCTCAA 

SISggaaccatgttcccctcgcgcgccaacaagctctacgagctcttttgcaagtacg 

act??toSctccatgcctcctgccgagctcgagcgcatcgagaagcgtatctt^ 

cgcgcactccaggaggtctgggaggagaccaaggacttttacattaacggtctcaagaa 

cccggagaagatccagcgcgccgagcacgaccccaagctcaagatgtcgctctgcttcc 

gctggtaccttggtcttgccagccgctgggccaacatgggcgccccggaccgcgtcatg 

gactaccaggtctggtgtggcccggccattggcgccttcaacgacttcatcaagggcac 

ctacctcgaccccgctgtctccaacgagtacccctgtgtcgtccagatcaacctgcaaa 

tccScgtggtgcctgctacct5cgccgtctcaacgcc 
gacctcgagaccgaggatgctgcctttgtctacgagcccaccaacgcgctctaagaaag 

tgaaccttgtcctaacccgacagcgaatggcgggagggggcgggctaaaagatcgtatt 
acatagtatttttcccctactctttgtgaaaaaaaaaaaaaaaaaaa 
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RCRRVSPRRAAPPPPLARTPARLAAPAVSNELLEKAETWMEVLAAKTGYETDMIESDM 
ELETELGIDSIKRVEILSEVQAMLNVEAKDVDALSRTRTVGEWNAMKAEIAGGSAPAP 
AAAAPG PAAAAP AP AVS SELLEKAET WMEVLAAKTGYETDM I ES DME LETELG I DS I K 
RVEILSEVQAMLNVEAKDVDALSRTRTVGEWDAMKAEIAGSSASAPAAAAPAPAAAAP 
APAAAAPAVSNELLEKAETWMEVLAAKTGYETDMIESDMELETELGIDSIKRVEILSE 
VQAMLNVEAKDVDALSRTRTVGEWDAMKAEIAGGSAPAPAAAAPAPAAAAPAVSNELL 
EKAETWMEVLAAKTGYETDMIESDMELETELGIDSIKRVEILSEVQAMLNVEAKDVDA 
LSRTRTVGEWDAMKAEIAGSSAPAPAAAAPAPAAAAPAPAAAAPAVSSELLEKAETW 
MEVLAAKTGYETDMIESDMELETELGIDSIKRVEILSEVQAMLNVEAKDVDALSRTRTV
GEWDAMKAEIAGGSAPAPAAAAPAPAAAAPAVSNELLEKAETWMEVLAAKTGYETDM
IESDMELETELGIDSIKRVEILSEVQAMLNVEAKDVDALSRTRTVGEWDAMKAEIAGG
SAPAPAAAAPASAGAAPAVKIDSVHGADCDDLSLMHAKWDIRRPDELILERPENRPVL 
WDDGSELTLALVRVLGACAWLTFEGLQLAQRAGAAAIRHVLAKDLSAESAEKAIKEA 
EQRFGALGGFISQQAERFEPAEILGFTLMCAKFAKASLCTAVAGGRPAFIGVARLDGRL 
GFTSQGTSDALKRAQRGAIFGLCKTIGLEWSESDVFSRGVDIAQGMHPEDAAVAIVREM 
ACADIRIREVGIGANQQRCTIRAAKLETGNPQRQIAKDDVLLVSGGARGITPLCIREIT 
RQIAGGKYILLGRSKVSASEPAWCAGITDEKAVQKAATQELKRAFSAGEGPKPTPRAVT 
KLVGSVLGAREVRSSIAAIEALGGKAIYSSCDVNSAADVAKAVRDAESQLGARVSGIVH
ASGVLRDRLIEKKLPDEFDAVFGTKVTGLENLLAAVDRANLKHMVLFSSLAGFHGNVGQ 
SDYAMANEALNKMGLELAKDVSVKSICFGPWDGGMVTPQLKKQFQEMGVQIIPREGGAD 
TVARIVLGSSPAEILVGNWRTPSKKVGSDTITLHRKISAKSNPFLEDHVIQGRRVLPMT
LAIGSLAETCLGLFPGYSLWAIDDAQLFKGVTVDGDVNCEVTLTPSTAPSGRVNVQATL 
KTFSSGKLVPAYRAVIVLSNQGAPPANATMQPPSLDADPALQGSVTDGKTLFHGPAFRG 
IDDYLSCTKSQLVAKCSAVPGSDAARGEFATDTDAHDPFVNDLAFQAMLVWVRRTLGQA 
ALPNSIQRIVQHRPVPQDKPFYITLRSNQSGGHSQHKHALQFHNEQGDLFIDVQASVIA 

TDSLAF 
AVFEEHDPSNAACTGHDSISALSARCGGESNMRIAITGMDATFGALKGLDAFERAIYTG
AHGAIPLPEKRWRFLGKDKDFLDLCGVKATPHGCYIEDVEVDFQRLRTPMTPEDMLLPQ 
QLLAVTTIDRAILDSGMKKGGNVAVFVGLGTDLELYRHRARVALKERVRPEASKKLNDM 
MQYINDCGTSTSYTSYIGNLVATRVSSQWGFTGPSFTITEGNNSVYRCAELGKYLLETG
EVDGVWAGVDLCGSAENLYVKSRRFKVSTSDTPRASFDAAADGYFVGEGCGAFVLKRE 
TSCTKDDRIYACMDAIVPGNVPSACLREALDQARVKPGDIEMLELSADSA'RHLKDPSVL 
PKELTAEEEIGGLQTILRDDDKLPRNVATGSVKATVGDTGYASGAASLIKAALCIYNRY 
LPSNGDDWDEPAPEAPWDSTLFACQTSRAWLKNPGERRYAAVSGVSETRSCYSVLLSEA 
EGHYERENRISLDEEAPKLIVLRADSHEEILGRLDKIRERFLQPTGAAPRESELKAQAR 
RIFLELLGETLAQDAASSGSQKPLALSLVSTPSKLQREVELAAKGIPRCLKMRRDWSSP 
AGSRYAPEPLASDRVAFMYGEGRSPYYGITQDIHRIWPELHEVINEKTNRLWAEGDRWV 
MPRASFKSELESQQQEFDRNMIEMFRLGILTSIAFTNLARDVLNITPKAAFGLSLGEIS 
MIFAFSKKNGLISDQLTKDLRESDVWNKALAVEFNALREAWGIPQSVPKDEFWQGYIVR 
GTKQDIEAAIAPDSKYVRLTIINDANTALISGKPDACKAAIARLGGNIPALPVTQGMCG 
HCPEVGPYTKDIAKIHANLEFPWDGLDLWTTINQKRLVPRATGAKDEWAPSSFGEYAG 
QLYEKQANFPQIVETIYKQNYDVFVEVGPNNHRSTAVRTTLGPQRNHIiAGAIDKQNEDA 
WTTIVKLVASLKAHLVPGVTISPLYHSKLVAEAQACYAALCKGEKPKKNKFVRKIQLNG
RFNSKADPISSADLASFPPADPAIEAAISSRIMKPVAPKFYARLNIDEQDETRDPILNK 
DNAPSSSSSSSSSSSSSSSPSPAPSAPVQKKAAPAAETKAVASADALRSALLDLDSMLA 
LSSASASGNLVETAPSDASVIVPPCNIADLGSRAFMKTYGVSAPLYTGAMAKGIASADL 
VIAAGRQGILASFGAGGLPMQWRESIEKIQAALPNGPYAVNLIHSPFDSNLEKGNVDL 
FLEKGVTFVEASAFMTLTPQWRYRAAGLTRNADGSVNIRNRIIGiCVSRTELAEMFMRP 
APEHLLQKLIASGEINQEQAELARRVPVADDIAVEADSGGHTDNRPIHVILPLIINLRD 
RLHRECGYPANLRVRVGAGGGIGCPQAALATFNMGASFIVTGTVNQVAKQSGTCDNVRK 
QLAKATYSDVCMAPAADMFEEGVKLQVLKKGTMFPSRANKLYELFCKYDSFESMPPAEL 
ARVEKRIFSRALEEVWDETKNFYINRLHNPEKIQRAERDPKLKMSLCFRWYLSLASRWA
NTGASDRVMDYQVWCGPAIGSFNDFIKGTYLDPAVANEYPCWQINKQILRGACFLRRL 
EILRNARLSDGAAALVASIDDTYVPAEKL
RAEAGREPEPAPQITSTAAESQQQQQQQQQQQQQQQPREGDKEKAAETMALRVKTNKKPCWEMT

KEELTSGKTEVFNYEELLEFAEGDIAKVFGPEFAVIDKYPRRVRLPAREYLLVTRVTLMDAEVN
NYRVGARMVTEYDLPVNGELSEGGDCPWAVLVESGQCDLMLISYMGIDFQNQGDRVYRLLNTTL 
TFYGVAHEGETLEYDIRVTGFAKRLDGGISMFFFEYDCYVNGRLLIEMRDGCAGFFTNEELDAG 
KGWFTRGDLAARAKIPKQDVSPYAVAPCLHKTKLNEKEMQTLVDKDWASVFGSKNGMPEINYK 
LCARKMLMIDRVTSIDHKGGVYGLGQLVGEKILERDHWYFPCHFVKDQVMAGSLVSDGCSQMLK 
MYMIWLGLHLTTGPFDFRPVNGHPNKVRCRGQISPHKGKLVYVMEIKEMGFDEDNDPYAIADVN 
IIDVDFEKGQDFSLDRISDYGKGDLNKKIWDFKGIALKMQKRSTNKNPSKVQPVFANGAATVG 
PEASKASSGASASASAAPAKPAFSADVLAPKPVALPEHILKGDALAPKEMSWHPMARIPGNPTP 
SFAPSAYKPRNIAFTPFPGNPNDNDHTPGKMPLTWFN^4AEFMAGKVSMCLGPEFAKFDDSNTSR 
SPAWDLAliVTRAVSVSDLKHVNYRNIDLDPSKGTMVGEFDCPADAWFYKGACNDAHMPYSILME 
glJcrcttac aaagaaacta tctcaatgcg aatttaacct taattccgtt taattacggc 60
ctgatagagc atcacccaat cagccataaa actgtaaagt gggtactcaa aggtggctgg 120 
gcgattcttc tcaaatacaa agtgcccaac ccaagcaaat ccatatccga taacaggtaa 180
aagtagcaat aaaccccagc gctgagttag taatacataa gcgaataata ggatcactaa 240
actactgccg aaatagtgca atattcgaca gtttctatgc cgatgttgag ataaataaaa 300 
agggtaaaat tcagcaaaag aacgatagcg cttactcatt actcacaccc cggtaaaaaa 360
gcaactcgcc attaacttgg ccaatcgtca gttgttctat cgtctcaaag ttatgccgac 420 
taaataactc tatatgtgca ctacgattag caaaaactcc gataccatca agatgaagtt 480
gttcatcaca ccaactcaaa actgcgtcga taagcttact gccatagccc tcgccttgct 540 
ccacatttgc gatagcaata aaccgcaaaa tgccacattg gccacttggt aagctctcta 600 
taatctgatt ttctttgtta ataagtgcct gagttgaata ccaaccagta ctcaacaaca 660 
tctttaaacg ccaatgccaa aaacgcgctt cacctaaggg aacctgctga gtcactatgc 720 
aggctacgcc tatcaatcta cccccaacga acataccaat aagtgcttgc tcctgttgcc 780
agagctcatt gagctcttct cgaacagccc cgcgaagcct ttgctcatac tgcgcttgat 840 
caccactaaa aagtgttccg ataaaaaagg gatcatcatg ataggcgtta cagagaatag 900 
aggctgctat gcgtaaatct tctgccgtga gataaactgc acgacactct tccatggctt 960 
gatcttccat tgttattgtc cttgaccttg atcacacaac accaacgtaa caagactgta 1020
tagaagtgca attaataatc aacccgtgca ctaagcaggt cagcattcct tcgccaaaca 1080
agctctattg gcitttgacaa aacttcgcct agacctraac gacagaaacc ataatgaa^g 1140 
agaaaagcta caacctagag gggaataatc aaacaactgc taagatctag ataatgtaat 1200
aaacaccgag tttatcgacc atacttagat agagtcatag caacgagaat agttatggat 1260 
acaacgccgc aagatctatc acacctgtrt ttacagctag gattagcaaa tgatcaaccc 1320 
gcaattgaac agtttatcaa tgaccatcaa ttagcggaca atatattgct acatcaagca 1380 
agcttttgga gcccatcgca aaagcacttc ttaattgagt catttaatga agatgcccag 1440 
tggaccgaag tcatcgacca cttagacacc ttattaagaa aaaactaacc attacaacag 1500 
caactttaaa ttttgccgta agccatctcc ccccacccca caacagcgtt gttgcttatg 1560 
accactggag tacattcgtc tttagtcgtt ttaccatcac catgggtacg ttgagtgcga 1620 
taaaaaagca cataaacttc tttatcggcc tgaatatagg cttcgttaaa atcagctgtt 1680 
cccattaaag taaccacttg ctctttactc atgcctagag atatctttgt caaattgtca 1740 
cggtttttat cttgagtttt ctcccaagca ccgtgattat cccagtcaga ttccccatca 1800 
ccaacattga ccacacagcc cgttagccct aagcttgcaa tcccaaaaca tgctaaacct 1860
aataatttat ttttcatttt aacttcctgt tatgacatta tttttgctta gaagaaaagc 1920 
aacttacatg ccaaaacaca agctgttgtt ttaaatgact ttatttatta ttagcctttt 1980 
aggatatgcc tagagcaata ataattacca atgtttaagg aatttgacta actatgagtc 2040 
cgattgagca agtgctaaca gctgctaaaa aaatcaatga acaaggtaga gaaccaacat 2100 
tagcattgat taaaaccaaa cttggtaata gcatcccaat gcgcgagtta atccaaggtt 2160 
tgcaacagtt taagtctatg agtgcagaag aaagacaagc aatacctagc agcttagcaa 2220 
cagcaaaaga aactcaatat ggtcaatcaa gcttatctca atctgaacaa gctgatagga 2280 
tcctccagct agaaaacgcc ctcaatgaat taagaaacga atttaatggg ctaaaaagtc 2340 
aatttgataa cttacaacaa aacctgatga ataaagagcc tgacaccaaa tgcatgtaat 2400 
tgaactacga tttgaatgtt ttgataacac cacgattact gcagcagaaa aagccattaa 2460
tggtttgctt gaagcttatc gagccaatgg ccaggttcta ggtcgtgaat ttgccgttgc 2520 
atttaacgat ggtgagttta aagcacgcat gttaacccca gaaaaaagca gcttatctaa 2580
acgctttaat agtccttggg taaatagtgc actcgaagag ctaaccgaag ccaaattgct 2640 
tgcgccacgt gaaaagtata ttggccaaga tattaattct gaagcatcta gccaagacac 2700 
accaagttgg cagctacttt acacaagtta tgtgcacatg tgctcaccac taagaaatgg 2760
cgacaccttg cagcctattc cactgtatca aattccagca actgccaacg gcgatcataa 2820 
acgaatgatc cgttggcaaa cagaatggca agcttgtgat gaattgcaaa tggccgcagc 2880 
tactaaagct gaatttgccg cacttgaaga gctaaccagt catcagagtg atctatttag 2940
gcgtggttgg gacttacgtg gcagagtcga atacttgacg aaaattccga cctattacta 3000 
tttataccgt gttggcggtg aaagcttagc agtagaaaag cagcgctctt gtcctaagtg 3060 
tggcagtcaa gaatggctgc tcgataaacc attattggat atgttccatt ttcgctgtga 3120 
cacctgccgc atcgtatcta atatctcttg ggaccattta taactcttcc gagtcttatc 3180 
acactagagt ttagtcagca taaaaatggc gcttatattt caattaaaag aaatataagc 3240 
gccattttca tcgatactat atatcagcag actattttcc gcgtaaatta gcccacatta 3300 
atttcattct ttgccagatc cctggatgat ctagttgtgg catcgactct tcaataggtt 3360 
taaccgcagg tgtaaccctt ggagtcaatt cgtttataaa ctcgtttaaa ctgtcactta 3420 
atttaacgct ttgtacttca cctggaattt caatccatac gctgccatca ctattattaa 3480 
ccgtcaacat tttatcttca tcatcaagaa taccaataaa ccaagtcggc tcttgcttaa 3540 
gctttctctt catcattaaa tgaccaatga tgttttgttg taagtattca aaatcagttt 3600 
gatcccacac ttggattagc tcaccttggc cccattgtga gtcaaaaaat agcggtgcag 3660 
aaaaatgact gccaaaaaat ggattaattt ctgcagataa tgtcatttca agtgctgttt 3720 
caacattagc aaattcacca ggttgttgac gtacaaccga ttgccaaaac actgcgccat 3780
cggagcccgc ttcggcgaca acacactcag acttttgtcc ttgcgcataa tatcttggct 3840 
gttcaccaag cttatccatg taggcttgtt gatatttaga taaaaaaaga tctaaagcag 3900 
gtaaagaaga cacttaagcc agttccaaaa tcagttataa taggggtcta ttttgacatg 3960 
gaaaccgtat tgatgacaca acatcatgat ccctacagta acgcccccga actttctgaa 4020
ttaactttag gaaagtcgac cggttatcaa gagcagtatg atgcatcttt actacaagcg 4080
tgccgcgtaa attaaaccgt gatgctatcg gtctaaccaa tgagctacct tttcatggct 4140 
gtgatatttg gactggctac gaactgtctt ggctaaatgc taaaggcaag ccaatgattg 4200 
ctattgcaga ctttaaccta agttttgata gtaaaaatct gatcgagtct aagtcgttta 4260 
agctgtattt aaacagctat aaccaaacac gatttgatag cgttcaagcg gttcaagaac 4320 
gtttaactga agacttaagc gcctgtgccc aaggcacagt tacggtaaaa gtgattgaac 4380 
ctaagcaatt taaccacctg agagtggttg atatgccagg tacctgcatt gacgatttag 4440 
atattgaagt tgatgactat agctttaact ctgactatct caccgacagt gttgatgaca 4500 
aagtcatggt tgctgaaacg ctaacgtcaa acttattgaa atcaaactgc ctaatcactt 4560 
ctcagcctga ctggggtaca gtgatgatcc gttatcaagg gcctaagata gaccgtgaaa 4620 
agctacttag atatctgatt tcatttagac agcacaatga atttcatgag cagtgtgttg 4680 
agcgtatatt tgttgattta aagcactatt gccaatgtgc caaacttact gtctatgcac 4740 
gttatacccg ccgtggtggt ttagatatca acccatatcg tagcgacttt gaaaaccctg 4800 
cagaaaatca gcgcctagcg agacagtaat tgattgcagt acctacaaaa aacaatgcct 4860 
ataagccaag cttatgggca tttttatatt atcaacttgt catcaaacct cagccgccaa 4920 
gccttttagt tttatcgcta aattaagccg ctctctcagc caaatatttg caggattttg 4980 
ctgtaattta tggctccaca ccatgaaata ctctatcggc tctaccgcaa aaggtaagtc 5040 
aaatacctgt aagccaaaca gcttggcata ttcgtcagtg tgggcttttg acgcgatagc 5100 
taacgcatca ctttttgagg caaccgacat catacttaat attgatgatt gctcgctgtg 5160 
catttgcctt gccggtaaca cctgtttagt cagcaagtcg gcaacactta aattgtagcg 5220 
gcgcatctta aaaataatat gcttttcatt aaagtattgc tcttgcgtca acccaccttg 5280 
gatccttggg tgagcatttc gtgccacaca aactaattta tcctgcatta ctttttgact 5340 
cttaaatgcc gcagattctg gcagccaaat atctaaggct aaatccacct tttctagttg 5400 
taggtccatc tgcaactctt cttcaatgag cggcggctca cgaaatacaa tattaattgc 5460 
agtgccctgt aacacttgct caatttgatc ttgcaagagt tgtattgccg actcgctggc 5520 
atacacataa aaagttcgct cacttgaagt ggggtcaaat gcttcaaagc tagtcgcaac 5580 
ttgctcaatt gttgacatag cgcccgcgag ctgttgataa agcgtcatcg cacttgcggt 5640 
aggtttaact cccctaccca ctcgagtaaa caactcttct ccaacaatac tttttagcct 5700 
cgaaatcgca ttactaaccg acgactgagt caaatccagc tcttctgccg cccggctaaa 5760 
agatgaggtg cgatacaccg cagtaaaaac gcgaaataaa ttaagatcaa aagctttttg 5820 
ctgcgacata aatcagctat ctccttatcc ttatccttat ccttataaaa agttagctcc 5880 
agagcactct agctcaaaaa caactcagcg tattaagcca atattttggg aactcaatta 5940 
atattcataa taaaagtatt cataatataa ataccaagtc ataatttagc cctaattatt 6000 
aatcaattca agttacctat actggcctca attaagcaaa tgtctcatca gtctccctgc 6060 
aactaaatgc aatattgaga cataaagctt tgaactgatt caatcttacg agggtaactt 6120 
atgaaacaga ctctaatggc tatctcaatc atgtcgcttt tttcattcaa tgcgctagca 6180 
gcgcaacatg aacatgacca catcactgtt gattacgaag ggaaagccgc aacagaacac 6240 
accatagctc acaaccaagc tgtagctaaa acacttaact ttgccgacac gcgtgcattt 6300 
gagcaatcgt ctaaaaatct agtcgccaag tttgataaag caactgccga tatattacgt 6360 
gccgaatttg cttttattag cgatgaaatc cctgactcgg ttaacccgtc tctctaccgt 6420 
caggctcagc ttaatatggt gcctaatggt ctgtataaag tgagcgatgg catttaccag 6480 
gtccgcggta ccgacttatc taaccttaca cttatccgca gtgataacgg ttggatagca 6540 
tacgatgttt tgttaaccaa agaagcagca aaagcctcac tacaatttgc gttaaagaat 6600 
ctacctaaag atggcgattt acccgttgtt gcgatgattt actcccatag ccatgcggac 6660 
cactttggcg gagctcgcgg tgttcaagag atgttccctg atgtcaaagt ctacggctca 6720 
gataacatca ctaaagaaat tgtcgatgag aacgtacttg ccggtaacgc catgagccgc 6780 
cgcgcagctt atcaatacgg cgcaacactg ggcaaacatg accacggtat tgttgatgct 6840
gcgctaggta aaggtctatc aaaaggtgaa atcacttacg tcgccccaga ctacacctta 6900 
aacagtgaag gcaaatggga aacgctgacg attgatggtc tagagatggt gtttatggat 6960
gcctcgggca ccgaagctga gtcagaaatg atcacttata ttccctctaa aaaagcgctc 7020 
tggacggcgg agcttaccta tcaaggtat^ cacaacattt atacgctgcg cggcgctaaa 7080 
gtacgtgatg cgctcaagtg gtcaaaagat atcaacgaaa tgatcaatgc ctttggtcaa 7140 
gatgtcgaag tgctgtttgc ctcgcactct gcgccagtgt ggggtaacca agcgatcaac 7200 
gatttcttac gcctacagcg tgataactac ggcctagtgc acaatcaaac cttgagactt 7260 
gccaacgatg gtgtcggtat acaagatatt ggcgatgcga ttcaagacac gattccagag 7320 
tctatctaca agacgtggca taccaatggt taccacggca cttatagcca taacgctaaa 7380 
gcggtttata acaagtatct aggctacttc gatatgaacc cagccaacct taatccgctg 7440 
ccaaccaagc aagaatctgc caagtttgtc gaatacatgg gcggcgcaga tgccgcaatt 7500 
aagcgcgcta aagatgatta cgctcaaggt gaataccgct ttgttgcaac ggcattaaat 7560 
aaggtggtga tggccgagcc agaaaatgac tccgctcgtc aattgctagc cgatacctat 7620 
gagcaacttg gttatcaagc agaaggggct ggctggagaa acatttactt aactggcgca 7680 
caagagctac gagtaggtat tcaagctggc gcgcctaaaa ccgcatcggc agatgtcatc 7740 
agtgaaatgg acatgccgac tctatttgac ttcctcgcgg tgaagattga tagtcaacag 7800 
gcggctaagc acggcttagt taagatgaat gttatcaccc ctgatactaa agatattctc 7860 
tatattgagc taagcaacgg taacttaagc aacgcagtgg tcgacaaaga gcaagcagct 7920
gacgcaaacc ttatggttaa taaagctgac gttaaccgca tcttacttgg ccaagtaacc 7980 
ctaaaagcgt tattagccag cggcgatgcc aagctcactg gtgataaaac ggcatttagt 8040 
aaaatagccg atagcatggt cgagtttaca cctgacttcg aaatcgtacc aacgcctgtt 8100 
aaatgaggca ttaatctcaa caagtgcaag ctagacataa aaatggggcg attagacgcc 8160 
ccatttttta tgcaattttg aactagctag tcttagctga agctcgaaca acagctttaa 8220 
aattcacttc ttctgctgca atacttattt gctgacactg accaatactc agtgcaaaac 8280 
gataactatc atcaagatgg cccagtaaac aatgccaatt atcagcagcg ttcatttgct 8340 
gttctttagc ctcaatcaaa cctaaaccag acttttgtgg ctcagcgtta ggcttattag 8400
aactcgactc tagtaaagca agaccaatat cttgttttaa caaaacctgt cgctgattaa 8460 
gttgatgctc aaccttgtga tccgcaatag catcggaaat atcaacacaa tggctcaagc 8520
ttttaggtgc attaactcca agaaaagttt cgctcagtgc agagaagtca aacgcaaaag 8580 
attttagcga taatgccagc ccaagtcctt tcgctttaat gtaagactcc ttgagcgccc 8640
acaaatcaaa aaagcggtct cgctgcaagg cctctggtaa cgctaacaag gctcgctttt 8700
ctgattcaga gaaataatga ctaagaatag agtggatatt ggtgctgtta cggcaacgct 8760 
caatgtcgac gccaaactca atactagcag agtcagtttc ctccttgctt gcctgactgg 8820
cgcctttatt atcagcagtg caaatgccta ctaatagcca atctccacta tgactcacat 8880 
taaagtggac cccggtttga gcaaattgcg catcactcaa tctaggctta cctttgtcgc 8940 
catattcaaa gcgccattca ttggggcgta tttcactatg ttgtgacaat aaagcgcgca 9000 
aatagcctct taccattaaa ccttgagttt tagcttcttg tttaatgtag cgattaacct 9060 
taattaactc atcttcaggc agccatgact taaccaactc tgtagtctgg ttatcgcact 9120 
cttgtattgt taacggacag aagtataagg aaatcaatcg agaagttagc aatttttcag 9180 
gacactcttt aaagcaacaa acataacccc tatttttacc aatttaagat caaaactaaa 9240 
gccaaaacta attgagaata gtgtcaaact agctttaaag gaaaaaaata taaaaagaac 9300 
attatacttg tataaattat tttacacacc aaagccatga tcttcacaaa attagctccc 9360 
tctccctaaa acaagattga ataaaaaaat aaaccttaac tttcatatag ataaaacaaa 9420 
ccaatgggat aaagtatatt gaattcattt ttaaggaaaa attcaaattg aattcaagct 9480 
cttcagtaaa agcatatttt gccgttagtg tgaaaaaaaa caaatttaaa aaccaacata 9540 
gaacaaataa gcagacaata aaaccaaggc gcaacacaaa caacgcgctt acaattttca 9600 
caaaaaagca acaagagtaa cgtttagtat ttggatatgg ttattgtaat tgagaatttt 9660 
ataacaatta tattaaggga atgagtatgt ttttaaattc aaaactttcg cgctcagtca 9720 
aacttgccat atccgcaggc ttaacagcct cgctagctat gcctgttttt gcagaagaaa 9780 
ctgctgctga agaacaaata gaaagagtcg cagtgaccgg atcgcgaatc gctaaagcag 9840
agctaactca accagctcca gtcgtcagcc tttcagccga agaactgaca aaatttggta 9900 
atcaagattt aggtagcgta ctagcagaaX tacctgctat tggtgcaacc aacactatta 9960 
ttggtaataa caatagcaac tcaagcgcag gtgttagctc agcagacttg cgtcgtctag 10020 
gtgctaacag aaccttagta ttagtcaacg gtaagcgcta cgttgccggc caaccgggct 10080 
cagctgaggt agatttgtca actataccaa ctagcatgat ctcgcgagtt gagattgtaa 10140 
ccggcggtgc ttcagcaatt tatggttcgg acgctgtatc aggtgttatc aacgttatcc 10200 
ttaaagaaga ctttgaaggc tttgagttta acgcacgtac tagcggttct actgaaagtg 10260 
taggcactca agagcactct tttgacattt tgggtggtgc aaacgttgca gatggacgtg 10320 
gtaatgtaac cttctacgca ggttatgaac gtacaaaaga agtcatggct accgacattc 10380 
gccaattcga tgcttgggga acaattaaaa acgaagccga tggtggtgaa gatgatggta 10440 
ttccagacag actacgtgta ccacgagttt attctgaaat gattaatgct accggtgtta 10500 
tcaatgcatt tggtggtgga attggtcgct caacctttga cagtaacggc aatcctattg 10560 
cacaacaaga acgtgatggg actaacagct ttgcatttgg ttcattccct aatggctgtg 10620 
acacatgttt caacactgaa gcatacgaaa actatattcc aggggtagaa agaataaacg 10680 
ttggctcatc attcaacttt gattttaccg ataacattca attttacact gacttcagat 10740 
atgtaaagtc agatattcag caacaatttc agccttcatt ccgttttggt aacattaata 10800 
tcaatgttga agataacgcc tttttgaatg acgacttgcg tcagcaaatg ctcgatgcgg 10860 
gtcaaaccaa tgctagtttt gccaagtttt ttgatgaatt aggaaatcgc tcagcagaaa 10920 
ataaacgcga acttttccgt tacgtaggtg gctttaaagg tggctttgat attagcgaaa 10980 
ccatatttga ttacgacctt tactatgttt atggcgagac taataaccgt cgtaaaaccc 11040 
ttaatgacct aattcctgat aactttgtcg cagctgtcga ctctgttatt gatcctgata 11100 
ctggcttagc agcgtgtcgc tcacaagtag caagcgctca aggcgatgac tatacagatc 11160 
ccgcgtctgt aaatggtagc gactgtgttg cttataaccc atttggcatg ggtcaagctt 11220 
cagcagaagc ccgcgactgg gtttctgctg atgtgactcg tgaagacaaa ataactcaac 11280 
aagtgattgg tggtactctc ggtaccgatt ctgaagaact atttgagctt caaggtggtg 11340 
caatcgctat ggttgttggt tttgaatacc gtgaagaaac gtctggttca acaaccgatg 11400 
aatttactaa agcaggtttc ttgacaagcg ctgcaacgcc agattcttat ggcgaatacg 11460 
acgtgactga gtattttgtt gaggtgaaca tcccagtact aaaagaatta ccttttgcac 11520 
atgagttgag ctttgacggt gcataccgta atgctgatta ctcacatgcc ggtaagactg 11580 
aagcatggaa agctggtatg ttctactcac cattagagca acttgcatta cgtggtacgg 11640 
taggtgaagc agtacgagca ccaaacattg cagaagcctt tagtccacgc tctcctggtt 11700 
ttggccgcgt ttcagatcca tgtgatgcag ataacattaa tgacgatccg gatcgcgtgt 11760 
caaactgtgc agcattgggg atccctccag gattccaagc taatgataac gtcagtgtag 11820 
ataccttatc tggtggtaac ccagatctaa aacctgaaac atcaacatcc tttacaggtg 11880
gtcttgtttg gacaccaacg tttgctgaca atctatcatt cactgtcgat tattatgata 11940 
ttcaaattga ggatgctatt ttgtcagtag ccacccagac tgtggctgat aactgtgttg 12000 
actcaactgg cggacctgac accgacttct gtagtcaagt tgatcgtaat ccaacgacct 12060 
atgatattga acttgttcgc tctggttatc taaatgccgc ggcattgaat accaaaggta 12120 
ttgaatttca agctgcatac tcattagatc tagagtcttt caacgcgcct ggtgaactac 12180 
gcttcaacct attggggaac caattacttg aactagaacg tcttgaattc caaaatcgtc 12240 
ctgatgagat taatgatgaa aaaggcgaag taggtgatcc agagctgcag ttccgcctag 12300 
gcatcgatta ccgtctagat gatctaagtg ttagctggaa cacgcgttat attgatagcg 12360 
tagtaactta tgatgtctct gaaaatggtg gctctcctga agatttatat ccaggccaca 12420 
taggctcaat gacaactcat gacttgagcg ctacatacta catcaatgag aacttcatga 12480 
ttaacggtgg tgtacgtaac ctatttgacg cacttccacc tggatacact aacgatgcgc 12540 
tatatgatct agttggtcgc cgtgcattcc taggtattaa ggtaatgatg taattaatta 12600 
ttacgcctct aactaataaa aatgcaatct cttcgtagag attgcatttt tttatgaaat 12660 
ccaatcttaa actggttctc cgagcatctt acgccttaaa aaccccgccc ctcaatgtaa 12720
cgccaaagtt aattgcttac acgcacttac acaaacgaac aatttcatta acacgagaca 12780 
cagctcacgc tttttatttt acccttgatX ttactacata aaattgcgtt ttagcgcaca 12840 
agtgttctcc caagctggtc gtatctgtaa ttattcagtc ccaggtgatt gtattgaccc 12900 
ataagctcag gtagtctgct ctgccattag ctaaacaata ttgacaaaat ggcgataaaa 12960 
tgtggcttag cgctaagttc accgtaagtt ttatcggcat taagtcccaa cagattatta 13020 
acggaaaccc gctaaactga tggcaaaaat aaatagtgaa cacttggatg aagctactat 13080 
tacttcgaat aagtgtacgc aaacagagac tgaggctcgg catagaaatg ccactacaac 13140 
acctgagatg cgccgattca tacaagagtc ggatctcagt gttagccaac tgtctaaaat 13200 
attaaatatc agtgaagcta ccgtacgtaa gtggcgcaag cgtgactctg tcgaaaactg 13260 
tcctaatacc ccgcaccatc tcaataccac gctaacccct ttgcaagaat atgtggttgt 13320 
gggcctgcgt tatcaattga aaatgccatt agacagattg ctcaaagcaa cccaagagtt 13380 
tatcaatcca aacgtgtcgc gctcaggttt agcaagatgt ttgaagcgtt atggcgtttc 13440 
acgggtgagt gatatccaaa gcccacacgt accaatgcgc tactttaatc aaattccagt 13500 
cactcaaggc agcgatgtgc aaacctacac cctgcactat gaaacgctgg caaaaacctt 13560 
agccttacct agtaccgatg gtgacaatgt ggtgcaagtg gtgtctctca ccattccacc 13620 
aaagttaacc gaagaagcac ccagttcaat tttgctcggc attgatcctc atagcgactg 13680 
gatctatctc gacatatacc aagatggcaa tacacaagcc acgaatagat atatggctta 13740 
tgtgctaaaa cacgggccat tccatttacg aaagttactc gtgcgtaact atcacacctt 13800 
tttacagcgc tttcctggag cgacgcaaaa tcgccgcccc tctaaagata tgcctgaaac 13860
aatcaacaag acgcctgaaa cacaggcacc cagtggagac tcataatgag ccagacctct 13920 
aaacctacaa actcagcaac tgagcaagca caagactcac aagctgactc tcgtttaaat 13980 
aaacgactaa aagatatgcc aattgctatt gttggcatgg cgagtatttt tgcaaactct 14040 
cgctatttga ataagttttg ggacttaatc agcgaaaaaa ttgatgcgat tactgaatta 14100 
ccatcaactc actggcagcc tgaagaatat tacgacgcag ataaaaccgc agcagacaaa 14160 
agctactgta aacgtggtgg ctttttgcca gatgtagact tcaacccaat ggagtttggc 14220 
ctgccgccaa acattttgga actgaccgat tcatcgcaac tattatcact catcgttgct 14280 
aaagaagtgt tggctgatgc taacttacct gagaattacg accgcgataa aattggtatc 14340 
accttaggtg tcggcggtgg tcaaaaaatt agccacagcc taacagcgcg tctgcaatac 14400 
ccagtattga agaaagtatt cgccaatagc ggcattagtg acaccgacag cgaaatgctt 14460 
atcaagaaat tccaagacca atatgtacac tgggaagaaa actcgttccc aggttcactt 14520 
ggtaacgtta ttgcgggccg tatcgccaac cgcttcgatt ttggcggcat gaactgtgtg 14580 
gttgatgctg cctgtgctgg atcacttgct gctatgcgta tggcgctaac agagctaact 14640
gaaggtcgct ctgaaatgat gatcaccggt ggtgtgtgta ctgataactc accctctatg 14700 
tatatgagct tttcaaaaac gcccgccttt accactaacg aaaccattca gccatttgat 14760
atcgactcaa aaggcatgat gattggtgaa ggtattggca tggtggcgct aaagcgtctt 14820 
gaagatgcag agcgcgatgg cgaccgcatt tactctgtaa ttaaaggtgt gggtgcatca 14880 
tctgacggta agtttaaatc aatctatgcc cctcgcccat caggccaagc taaagcactt 14940 
aaccgtgcct atgatgacgc aggttttgcg ccgcatacct taggtctaat tgaagctcac 15000 
ggaacaggta ctgcagcagg tgacgcggca gagtttgccg gcctttgctc agtatttgct 15060 
gaaggcaacg ataccaagca acacattgcg ctaggttcag ttaaatcaca aattggtcat 15120 
actaaatcaa ctgcaggtac agcaggttta attaaagctg ctcttgcttt gcatcacaag 15180 
gtactgccgc cgaccattaa cgttagtcag ccaagcccta aacttgatat cgaaaactca 15240 
ccgttttatc taaacactga gactcgtcca tggttaccac gtgttgatgg tacgccgcgc 15300 
cgcgcgggta ttagctcatt tggttttggt ggcactaact tccattttgt actagaagag 15360 
tacaaccaag aacacagccg tactgatagc gaaaaagcta agtatcgtca acgccaagtg 15420 
gcgcaaagct tccttgttag cgcaagcgat aaagcatcgc taattaacga gttaaacgta 15480 
ctagcagcat ctgcaagcca agctgagttt atcctcaaag atgcagcagc aaactatggc 15540 
gtacgtgagc ttgataaaaa tgcaccacgg atcggtttag ttgcaaacac agctgaagag 15600
ttagcaggcc taattaagca agcacttgcc aaactagcag ctagcgatga taacgcatgg 15660 
cacctacctg gtggcactag ctaccgcgcc gctgcagtag aaggtaaagt tgccgcactg 15720 
tttgctggcc aaggttcaca atatctcaat atgggccgtg accttacttg ttattaccca 15780 
gagatgcgtc agcaatttgt aactgcagat aaagtatttg ccgcaaatga taaaacgccg 15840 
ttatcgcaaa ctctgtatcc aaagcctgta tttaataaag atgaattaaa ggctcaagaa 15900 
gccattttga ccaataccgc caatgcccaa agcgcaattg gtgcgatttc aatgggtcaa 15960 
tacgatttgt ttactgcggc tggctttaat gccgacatgg ttgcaggcca tagctttggt 16020 
gagctaagtg cactgtgtgc tgcaggtgtt atttcagctg atgactacta caagctggct 16080 
tttgctcgtg gtgaggctat ggcaacaaaa gcaccggcta aagacggcgt tgaagcagat 16140 
gcaggagcaa tgtttgcaat cataaccaag agtgctgcag accttgaaac cgttgaagcc 16200 
accatcgcta aatttgatgg ggtgaaagtc gctaactata acgcgccaac gcaatcagta 16260 
attgcaggcc caacagcaac taccgctgat gcggctaaag cgctaactga gcttggttac 16320 
aaagcgatta acctgccagt atcaggtgca ttccacactg aacttgttgg tcacgctcaa 16380 
gcgccarttg ctaaagcgat tgacgcagcc aaatttacta aaacaagccg agcactttac 16440 
tcaaatgcaa ctggcggact ttatgaaagc actgctgcaa agattaaagc ctcgtttaag 16500 
aaacatatgc ttcaatcagt gcgctttact agccagctag aagccatgta caacgacggc 16560 
gcccgtgtat ttgttgaatt tggtccaaag aacatcttac aaaaattagt tcaaggcacg 16620 
cttgtcaaca ctgaaaatga agtttgcact atctctatca accctaatcc taaagttgat 16680 
agtgatctgc agcttaagca agcagcaatg cagctagcgg ttactggtgt ggtactcagt 16740 
gaaattgacc cataccaagc cgatattgcc gcaccagcga aaaagtcgcc aatgagcatt 16800 
tcgcttaatg ctgctaacca tatcagcaaa gcaactcgcg ctaagatggc caagtcttta 16860 
gagacaggta tcgtcacctc gcaaatagaa catgttattg aagaaaaaat cgttgaagtt 16920 
gagaaactgg ttgaagtcga aaagatcgtc gaaaaagtgg ttgaagtaga gaaagttgtt 16980
gaggttgaag ctcctgttaa ttcagtgcaa gccaatgcaa ttcaaacccg ttcagttgtc 17040 
gctccagtaa tagagaacca agtcgtgtct aaaaacagta agccagcagt ccagagcatt 17100 
agtggtgatg cactcagcaa cttttttgct gcacagcagc aaaccgcaca gttgcatcag 17160 
cagttcttag ctattccgca gcaatatggt gagacgttca ctacgctgat gaccgagcaa 17220 
gctaaactgg caagttctgg tgttgcaatt ccagagagtc tgcaacgctc aatggagcaa 17280 
ttccaccaac tacaagcgca aacactacaa agccacaccc agttccttga gatgcaagcg 17340 
ggtagcaaca ttgcagcgtt aaacctactc aatagcagcc aagcaactta cgctccagcc 17400 
attcacaatg aagcgattca aagccaagtg gttcaaagcc aaactgcagt ccagccagta 17460 
atttcaacac aagttaacca tgtgtcagag cagccaactc aagctccagc tccaaaagcg 17520 
cagccagcac ctgtgacaac tgcagttcaa actgctccgg cacaagttgt tcgtcaagcc 17580 
gcaccagttc aagccgctat tgaaccgatt aatacaagtg ttgcgactac aacgccttca 17640
gccttcagcg ccgaaacagc cctgagcgca acaaaagtcc aagccactat gcttgaagtg 17700 
gttgctgaga aaaccggtta cccaactgaa atgctagagc ttgaaatgga tatggaagcc 17760 
gatttaggca tcgartctat caagcgtgta gaaattcttg gcacagtaca agatgagcta 17820 
ccgggtctac ctgagcttag ccctgaagat ctagctgagt gtcgaacgct aggcgaaatc 17880 
gttgactata tgggcagtaa actgccggct gaaggctcta tgaattctca gctgtctaca 17940 
ggttccgcag ctgcgactcc tgcagcgaat ggtctttctg cggagaaagt tcaagcgact 18000 
atgatgtctg tggttgccga aaagactggc tacccaactg aaatgctaga gcttgaaatg 18060 
gatatggaag ccgatttagg catagattct atcaagcgcg ttgaaattct tggcacagta 18120 
caagatgagc taccgggtct acctgagctt agccctgaag atctagctga gtgtcgtact 18180 
ctaggcgaaa tcgttgacta tatgaactct aaactcgctg acggctctaa gctgccggct 18240 
gaaggctcta tgaattctca gctgtctaca agtgccgcag ctgcgactcc tgcagcgaat 18300
ggtctctctg cggagaaagt tcaagcgact atgatgtctg tggttgccga aaagactggc 18360 
tacccaactg aaatgctaga acttgaaatg gatatggaag ctgaccttgg catcgattca 18420 
atcaagcgcg ttgaaattct tggcacagta caagatgagc taccgggttt acctgagcta 18480
aatccagaag atttggcaga gtgtcgtact cttggcgaaa tcgtgactta tatgaactct 18540 
aaactcgctg acggctctaa gctgccagct gaaggctcta tgcactatca gctgtctaca 18600
agtaccgctg ctgcgactcc tgtagcgaat ggtctctctg cagaaaaagt tcaagcgacc 18660
atgatgtctg tagttgcaga taaaactggc tacccaactg aaatgcttga acttgaaatg 18720 
gatatggaag ccgatttagg tatcgattct atcaagcgcg ttgaaattct tggcacagta 18780 
caagatgagc taccgggttt acctgagcta aatccagaag atctagcaga gtgtcgcacc 18840 
ctaggcgaaa tcgttgacta tatgggcagt aaactgccgg ctgaaggctc tgctaataca 18900 
agtgccgctg cgtctcttaa tgttagtgcc gttgcggcgc ctcaagctgc tgcgactcct 18960
gtatcgaacg gtctctctgc agagaaagtg caaagcacta tgatgtcagt agttgcagaa 19020 
aagaccggct acccaactga aatgctagaa cttggcatgg atatggaagc cgatttaggt 19080 
atcgactcaa ttaaacgcgt tgagattctt ggcacagtac aagatgagct accgggtcta 19140 
ccagagctta atcctgaaga tttagctgag tgccgtacgc tgggcgaaat cgttgactat 19200 
atgaactcta agctggctga cggctctaag cttccagctg aaggctctgc taatacaagt 19260 
gccactgctg cgactcctgc agtgaatggt ctttctgctg acaaggtaca ggcgactatg 19320 
atgtctgtag ttgctgaaaa gaccggctac ccaactgaaa tgctagaact tggcatggat 19380 
atggaagcag accttggtat tgattctatt aagcgcgttg aaattcttgg cacagtacaa 19440 
gatgagctcc caggtttacc tgagcttaat cctgaagatc tcgctgagtg ccgcacgctt 19500 
ggcgaaatcg ttagctatat gaactctcaa ctggctgatg gctctaaact ttctacaagt 19560 
gcggctgaag gctctgctga tacaagtgct gcaaatgctg caaagccggc agcaatttcg 19620 
gcagaaccaa gtgttgagct tcctcctcat agcgaggtag cgctaaaaaa gcttaatgcg 19680 
gcgaacaagc tagaaaattg tttcgccgca gacgcaagtg ttgtgattaa cgatgatggt 19740 
cacaacgcag gcgttttagc tgagaaactt attaaacaag gcctaaaagt agccgttgtg 19800 
cgtttaccga aaggtcagcc tcaatcgcca ctttcaagcg atgttgctag ctttgagctt 19860
gcctcaagcc aagaatctga gcttgaagcc agtatcactg cagttatcgc gcagattgaa 19920 
actcaggttg gcgctattgg tggctttatt cacttgcaac cagaagcgaa tacagaagag 19980 
caaacggcag taaacctaga tgcgcaaagt tttactcacg ttagcaatgc gttcttgtgg 20040 
gccaaattat tgcaaccaaa gctcgttgct ggagcagatg cgcgtcgctg ttttgtaaca 20100 
gtaagccgta tcgacggtgg ctttggttac ctaaatactg acgccctaaa agatgctgag 20160 
ctaaaccaag cagcattagc tggtttaact aaaaccttaa gccatgaatg gccacaagtg 20220 
ttctgtcgcg cgctagatat tgcaacagat gttgatgcaa cccatcttgc tgatgcaatc 20280 
accagtgaac tatttgatag ccaagctcag ctacctgaag tgggcttaag cttaattgat 20340 
ggcaaagtta accgcgtaac tctagttgct gctgaagctg cagataaaac agcaaaagca 20400 
gagcttaaca gcacagataa aatcttagtg actggtgggg caaaaggggt gacatttgaa 20460 
tgtgcactgg cattagcatc tcgcagccag tctcacttta tcttagctgg gcgcagtgaa 20520
ttacaagctt taccaagctg ggctgagggt aagcaaacta gcgagctaaa atcagctgca 20580 
atcgcacata ttatttctac tggtcaaaag ccaacgccta agcaagttga agccgctgtg 20640 
tggccagtgc aaagcagcat tgaaattaat gccgccctag ccgcctttaa caaagttggc 20700 
gcctcagctg aatacgtcag catggatgtt accgatagcg ccgcaatcac agcagcactt 20760 
aatggtcgct caaatgagat caccggtctt attcatggcg caggtgtact agccgacaag 20820 
catattcaag acaagactct tgctgaactt gctaaagttt atggcactaa agtcaacggc 20880 
ctaaaagcgc tgctcgcggc acttgagcca agcaaaatta aattacttgc tatgttctca 20940 
tctgcagcag gtttttacgg taatatcggc caaagcgatt acgcgatgtc gaacgatatt 21000 
cttaacaagg cagcgctgca gttcaccgct cgcaacccac aagctaaagt catgagcttt 21060 
aactggggtc cttgggatgg cggcatggtt aacccagcgc ttaaaaagat gtttaccgag 21120 
cgtggtgtgt acgttattcc actaaaagca ggtgcagagc tatttgccac tcagctattg 21180 
gctgaaactg gcgtgcagtt gctcattggt acgtcaatgc aaggtggcag cgacactaaa 21240 
gcaactgaga ctgcttctgt aaaaaagctt aatgcgggtg aggtgctaag tgcatcgcat 21300 
ccgcgtgctg gtgcacaaaa aacaccacta caagctgtca ctgcaacgcg tctgttaacc 21360
ccaagtgcca tggtcttcat tgaagatcac cgcattggcg gtaacagtgt gttgccaacg 21420 
gtatgcgcca tcgactggat gcgtgaagcg gcaagcgaca tgcttggcgc tcaagttaag 21480 
gtacttgatt acaagctatt aaaaggcatt gtatttgaga ctgatgagcc gcaagagtta 21540 
acacttgagc taacgccaga cgattcagac gaagctacgc tacaagcatt aatcagctgt 21600 
aatgggcgtc cgcaatacaa ggcgacgctt atcagtgata atgccgatat taagcaactt 21660 
aacaagcagt ttgatttaag cgctaaggcg attaccacag caaaagagct ttatagcaac 21720 
ggcaccttgt tccacggtcc gcgtctacaa gggatccaat ctgtagtgca gttcgatgat 21780 
caaggcttaa ttgctaaagt cgctctgcct aaggttgaac ttagcgattg tggtgagttc 21840 
ttgccgcaaa cccacatggg tggcagtcaa ccttttgctg aggacttgct attacaagct 21900 
atgctggttt gggctcgcct taaaactggc tcggcaagtt tgccatcaag cattggtgag 21960 
tttacctcat accaaccaat ggcctttggt gaaactggta ccatagagct tgaagtgatt 22020 
aagcacaaca aacgctcact tgaagcgaat gttgcgctat atcgtgacaa cggcgagtta 22080 
agtgccatgt ttaagtcagc taaaatcacc attagcaaaa gcttaaattc agcattttta 22140 
cctgctgtct tagcaaacga cagtgaggcg aattagtgga acaaacgcct aaagctagtg 22200 
cgatgccgct gcgcatcgca cttatcttac tgccaacacc gcagtttgaa gttaactctg 22260 
tcgaccagtc agtattagcc agctatcaaa cactgcagcc tgagctaaat gccctgctta 22320 
3^^gtgcgcc gacacctgaa atgctcagca tcactatctc agatgatagc gatgcaaaca 22380 
gctttgagtc gcagctaaat gctgcgacca acgcaattaa caatggctat atcgtcaagc 22440 
ttgctacggc aactcacgct ttgttaatgc tgcctgcatt aaaagcggcg caaatgcgga 22500 
tccatcctca tgcgcagctt gccgctatgc agcaagctaa atcgacgcca atgagtcaag 22560 
tatctggtga gctaaagctt ggcgctaatg cgctaagcct agctcagact aatgcgctgt 22620
ctcatgcttt aagccaagcc aagcgtaact taactgatgt cagcgtgaat gagtgttttg 22680 
agaacctcaa aagtgaacag cagttcacag aggtttattc gcttattcag caacttgcta 22740 
gccgcaccca tgtgagaaaa gaggttaatc aaggtgtgga acttggccct aaacaagcca 22800 
aaagccacta ttggtttagc gaatttcacc aaaaccgtgt tgctgccatc aactttatta 22860 
atggccaaca agcaaccagc tatgtgctta ctcaaggttc aggattgtta gctgcgaaat 22920 
caatgctaaa ccagcaaaga tcaatgttta tcttgccggg taacagtcag caacaaataa 22980 
ccgcatcaat aactcagtta atgcagcaat tagagcgttt gcaggtaact gaggttaatg 23040
agctttctct agaatgccaa ctagagctgc tcagcataat gtatgacaac ttagtcaacg 23100 
cagacaaact cactactcgc gatagtaagc ccgcttatca ggctgtgatt caagcaagct 23160 
ctgttagcgc tgcaaagcaa gagttaagcg cgcttaacga tgcactcaca gcgctgtttg 23220
ctgagcaaac aaacgccaca tcaacgaata aaggcttaat ccaatacaaa acaccggcgg 23280 
gcagttactt aaccctaaca ccgcttggca gcaacaatga caacgcccaa gcgggtcttg 23340 
cttttgtcta tccgggtgtg ggaacggttt acgccgatat gcttaatgag ctgcatcagt 23400
acttccctgc gctttacgcc aaacttgagc gtgaaggcga tttaaaggcg atgctacaag 23460 
cagaagatat ctatcatctt gaccctaaac atgctgccca aatgagctta ggtgacttag 23520 
ccattgctgg cgtggggagc agctacctgt taactcagct gctcaccgat gagtttaata 23580 
ttaagcctaa ttttgcatta ggttactcaa tgggtgaagc atcaatgtgg gcaagcttag 23640 
gcgtatggca aaacccgcat gcgctgatca gcaaaaccca aaccgacccg ctatttactt 23700 
ctgctatttc cggcaaattg accgcggtta gacaagcttg gcagcttgat gataccgcag 23760 
cggaaatcca gtggaatagc tttgtggtta gaagtgaagc agcgccgatt gaagccttgc 23820 
taaaagatta cccacacgct tacctcgcga ttattcaagg ggatacctgc gtaatcgctg 23880 
gctgtgaaat ccaatgtaaa gcgctacttg cagcactggg taaacgcggt attgcagcta 23940 
atcgtgtaac ggcgatgcat acgcagcctg cgatgcaaga gcatcaaaat gtgatggatt 24000 
tttatctgca accgttaaaa gcagagcttc ctagtgaaat aagctttatc agcgccgctg 24060 
atttaactgc caagcaaacg gtgagtgagc aagcacttag cagccaagtc gttgctcagt 24120 
ctattgccga caccttctgc caaaccttgg actttaccgc gctagtacat cacgcccaac 24180 
atcaaggcgc taagctgttt gttgaaattg gcgcggatag acaaaactgc accttgatag 24240
acaagattgt taaacaagat ggtgccagca gtgtacaaca tcaaccttgt tgcacagtgc 24300 
ctatgaacgc aaaaggtagc caagatatta ccagcgtgat taaagcgctt ggccaattaa 24360
ttagccatca ggtgccatta tcggtgcaac catttattga tggactcaag cgcgagctaa 24420 
^^^^t^tgcca attgaccagc caacagctgg cagcacatgc aaatgttgac agcaagtttg 24480 
agtctaacca agaccattta cttcaagggg aagtctaatg tcattaccag acaatgcttc 24540
taaccacctt tctgccaacc agaaaggcgc atctcaggca agtaaaacca gtaagcaaag 24600 
caaaatcgcc attgtcggtt tagccactct gtatccagac gctaaaaccc cgcaagaatt 24660 
ttggcagaat ttgctggata aacgcgactc tcgcagcacc ttaactaacg aaaaactcgg 24720
cgctaacagc caagattatc aaggtgtgca aggccaatct gaccgttttt attgtaataa 24780 
aggcggctac attgagaact tcagctttaa tgctgcaggc tacaaattgc cggagcaaag 24840 
cttaaatggc ttggacgaca gcttcctttg ggcgctcgat actagccgta acgcactaat 24900 
tgatgctggt attgatatca acggcgctga tttaagccgc gcaggtgtag tcatgggcgc 24960 
gctgtcgttc ccaactaccc gctcaaacga tctgtttttg ccaatttatc acagcgccgt 25020 
tgaaaaagcc ctgcaagata aactaggcgt aaaggcattt aagctaagcc caactaatgc 25080 
tcataccgct cgcgcggcaa atgagagcag cctaaatgca gccaatggtg ccattgccca 25140 
taacagctca aaagtggtgg ccgatgcact tggccttggc ggcgcacaac taagcctaga 25200 
tgctgcctgt gctagttcgg tttactcatt aaagcttgcc tgcgattacc taagcactgg 25260 
caaagccgat atcatgctag caggcgcagt atctggcgcg gatcctttct ttattaatat 25320 
gggattctca atcttccacg cctacccaga ccatggtatc tcagtaccgt ttgatgccag 25380 
cagtaaaggt ttgtttgctg gcgaaggcgc tggcgtatta gtgcttaaac gtcttgaaga 25440 
tgccgagcgc gacaatgaca aaatctatgc ggttgttagc ggcgtaggtc tatcaaacga 25500 
cggtaaaggc cagtttgtat taagccctaa tccaaaaggt caggtgaagg cctttgaacg 25560 
tgcttatgct gccagtgaca ttgagccaaa agacattgaa gtgattgagt gccacgcaac 25620 
aggcacaccg cttggcgata aaattgagct cacttcaatg gaaaccttct ttgaagacaa 25680 
gctgcaaggc accgatgcac cgttaattgg ctcagctaag tctaacttag gccacctatt 25740 
aactgcagcg catgcgggga tcatgaagat gatcttcgcc atgaaagaag gttacctgcc 25800 
gccaagtatc aatattagtg atgctatcgc ttcgccgaaa aaactcttcg gtaaaccaac 25860
cctgcctagc atggttcaag gctggccaga taagccatcg aataatcatt ttggtgtaag 25920 
aacccgtcac gcaggcgtat cggtatttgg ctttggtggc tgtaacgccc atctgttgct 25980 
tgagtcatac aacggcaaag gaacagtaaa ggcagaagcc actcaagtac cgcgtcaagc 26040 
tgagccgcta aaagtggttg gccttgcctc gcactttggg cctcttagca gcattaatgc 26100 
actcaacaat gctgtgaccc aagatgggaa tggctttatc gaactgccga aaaagcgctg 26160 
gaaaggcctt gaaaagcaca gtgaactgtt agctgaattt ggcttagcat ctgcgccaaa 26220 
aggtgcttat gttgataact tcgagctgga ctttttacgc tttaaactgc cgccaaacga 26280
agatgaccgt ttgatctcac agcagctaat gctaatgcga gtaacagacg aagccattcg 26340 
tgatgccaag cttgagccgg ggcaaaaagt agctgtatta gtggcaatgg aaactgagct 26400 
tgaactgcat cagttccgcg gccgggttaa cttgcatact caattagcgc aaagtcttgc 26460
cgccatgggc gtgagtttat caacggatga ataccaagcg cttgaagcca tcgccatgga 26520 
cagcgtgctt gatgctgcca agctcaatca gtacaccagc tttattggta atattatggc 26580 
gtcacgcgtg gcgtcactat gggactttaa tggcccagcc ttcactattt cagcagcaga 26640 
gcaatctgtg agccgctgta tcgatgtggc gcaaaacctc atcatggagg ataacctaga 26700 
tgcggtggtg attgcagcgg tcgatctctc tggtagcttt gagcaagtca ttcttaaaaa 26760 
tgccattgca cctgtagcca ttgagccaaa cctcgaagca agccttaatc caacatcagc 26820 
aagctggaat gtcggtgaag gtgctggcgc ggtcgtgctt gttaaaaatg aagctacatc 26880 
gggctgctca tacggccaaa ttgatgcact tggctttgct aaaactgccg aaacagcgtt 26940 
ggctaccgac aagctactga gccaaactgc cacagacttt aataaggtta aagtgattga 27000 
aactatggca gcgcctgcta gccaaattca attagcgcca atagttagct ctcaagtgac 27060 
tcacactgct gcagagcagc gtgttggrca ctgctttgct gcagcgggta tggcaagcct 27120
attacacggc ttacttaact taaatactgt agcccaaacc aataaagcca attgcgcgct 27180 
tatcaacaat atcagtgaaa accaattatc acagctgttg attagccaaa cagcgagcga 27240 
acaacaagca ttaaccgcgc gtttaagcaa tgagcttaaa tccgatgcta aacaccaact 27300 
ggttaagcaa gtcaccttag gtggccgtga tatctaccag catattgttg atacaccgct 27360 
tgcaagcctt gaaagcatta ctcagaaatt ggcgcaagcg acagcatcga cagtggtcaa 27420 
ccaagttaaa cctattaagg ccgctggctc agtcgaaatg gctaactcat tcgaaacgga 27480 
aagctcagca gagccacaaa taacaattgc agcacaacag actgcaaaca ttggcgtcac 27540
cgctcaggca accaaacgtg aattaggtac cccaccaatg acaacaaata ccattgctaa 27600 
tacagcaaat aatttagaca agactcttga gactgttgct ggcaatactg ttgctagcaa 27660 
ggttggctct ggcgacatag tcaattttca acagaaccaa caattggctc aacaagctca 27720 
cctcgccttt cttgaaagcc gcagtgcggg tatgaaggtg gctgatgctt tattgaagca 27780
acagctagct caagtaacag gccaaactat cgataatcag gccctcgata ctcaagccgt 27840
cgatactcaa acaagcgaga atgtagcgat tgccgcagaa tcaccagttc aagttacaac 27900 
acctgttcaa gttacaacac ctgttcaaat cagtgttgtg gagttaaaac cagatcacgc 27960 
taatgtgcca ccatacacgc cgccagtgcc tgcattaaag ccgtgtatct ggaactatgc 28020
cgatttagtt gagtacgcag aaggcgatat cgccaaggta tttggcagtg attatgccat 28080 
tatcgacagc tactcgcgcc gcgtacgtct accgaccact gactacctgt tggtatcgcg 28140 
cgtgaccaaa cttgatgcga ccatcaatca atttaagcca tgctcaatga ccactgagta 28200 
cgacatccct gttgatgcgc cgtacttagt agacggacaa atcccttggg cggtagcagt 28260 
agaatcaggc caatgtgact tgatgcttat tagctatctc ggtatcgact ttgagaacaa 28320 
aggcgagcgg gtttatcgac tactcgattg taccctcacc ttcctaggcg acttgccacg 28380
tggcggagat acoctacgtt acgacattaa gatcaataac tatgctcgca acggcgacac 28440 
cdtgctgttc ttcttctcgt atgagtgttt tgttggcgac aagatgatcc tcaagatgga 28500 
tggcggctgc gctggcttct tcactgatga agagcttgcc gacggtaaag gcgtgattcg 28560 
cacaoaagaa gagattaaag ctcgcagcct agtgcaaaag caacgcttta atccgttact 28620 
agattgtcct aaaacccaat ttagttatgg tgatattcat aagctattaa ctgctgatat 28680 
tgagggttgt tttggcccaa gccacagtgg cgtccaccag ccgtcacttt gtttcgcatc 28740 
tgaaaaattc ttgatgattg aacaagtcag caaggttgat cgcactggcg gtacttgggg 28800 
acttggctta attgagggtc ataagcagct tgaagcagac cactggtact tcccatgtca 28860 
tttcaagggc gaccaagtga tggctggctc gctaatggct gaaggttgtg gccagttatt 28920 
gcagttctat atgctgcacc ttggtatgca tacccaaact aaaaatggtc gtttccaacc 28980 
tcttgaaaac gcctcacagc aagtacgctg tcgcggtcaa gtgctgccac aatcaggcgt 29040 
gctaacttac cgtatggaag tgactgaaat cggtttcagt ccacgcccat atgctaaagc 29100 
taacatcgat atcttgctta atggcaaagc ggtagtggat ttccaaaacc taggggtgat 29160
gataaaagag gaagatgagt gtactcgtta tccacttttg actgaatcaa caacggctag 29220 
cactgcacaa gtaaacgctc aaacaagtgc gaaaaaggta tacaagccag catcagtcaa 29280 
tgcgccatta atggcacaaa ttcctgatct gactaaagag ccaaacaagg gcgttattcc 29340 
gatttcccat gttgaagcac caattacgcc agactacccg aaccgtgtac ctgatacagt 29400 
gccattcacg ccgtatcaca tgtttgagtt tgctacaggc aatatcgaaa actgtttcgg 29460 
gccagagttc tcaatctatc gcggcatgat cccaccacgt acaccatgcg gtgacttaca 29520 
agtgaccaca cgtgtgattg aagttaacgg taagcgtggc gactttaaaa agccatcatc 29580 
gtgtatcgcc gaatatgaag tgcctgcaga tgcgtggtat ttcgataaaa acagccacgg 29640 
cgcagtgatg ccatattcaa ttttaatgga gatctcactg caacctaacg gctttatctc 29700 
aggttacatg ggcacaaccc taggcttccc tggccttgag ctgttcttcc gtaacttaga 29760 
cggtagcggt gagttactac gtgaagtaga tttacgtggt aaaaccatcc gtaacgactc 29820 
acgtttatta tcaacagtga tggccggcac taacatcatc caaagcttta gcttcgagct 29880 
aagcactgac ggtgagcctt tctatcgcgg cactgcggta tttggctatt ttaaaggtga 29940 
cgcacttaaa gatcagctag gcctagataa cggtaaagtc actcagccat ggcatgtagc 30000
taacggcgtt gctgcaagca ctaaggtgaa cctgcttgat aagagctgcc gtcactttaa 30060 
tgcgccagct aaccagccac actatcgtct agccggtggt cagctgaact ttatcgacag 30120 
tgttgaaatt gttgataatg gcggcaccga aggtttaggt tacttgtatg ccgagcgcac 30180 
cattgaccca agtgattggt tcttccagtt ccacttccac caagatccgg ttatgccagg 30240 
ctccttaggt gttgaagcaa ttattgaaac catgcaagct tacgctatta gtaaagactt 30300 
gggcgcagat ttcaaaaatc ctaagtttgg tcagatttta tcgaacatca agtggaagta 30360 
tcgcggtcaa atcaatccgc tgaacaagca gatgtctatg gatgtcagca ttacttcaat 30420 
caaagatgaa gacggtaaga aagtcatcac aggtaatgcc agcttgagta aagatggtct 30480 
gcgcatatac gaggtcttcg atatagctat cagcatcgaa gaatctgtat aaatcggagt 30540 
gactgtctgg ctattttact caatttctgt gtcaaaagtg ctcacctata ttcataggct 30600 
gcgcgctttt ttctggaaat tgagcaaaag tatctgcgtc ctaactcgat ttataagaat 30660 
ggtttaattg aaaagaacaa cagctaagag ccgcaagctc aatataaata attaagggtc 30720 
ttacaaataa tgaatcctac agcaactaac gaaatgcttt ctccgtggcc atgggctgtg 30780 
acagagtcaa atatcagttt tgacgtgcaa gtgatggaac aacaacttaa agattttagc 30840 
cgggcatgtt acgtggtcaa tcatgccgac cacggctttg gtattgcgca aactgccgat 30900 
atcgtgactg aacaagcggc aaacagcaca gatttacctg ttagtgcttt tactcctgca 30960 
ttaggtaccg aaagcctagg cgacaataat ttccgccgcg ttcacggcgt taaatacgct 31020 
tattacgcag gcgctatggc aaacggtatt tcatctgaag agctagtgat tgccctaggt 31080 
caagctggca ttttgtgtgg ttcgtttgga gcagccggtc ttattccaag tcgcgttgaa 31140 
gcggcaatta accgtattca agcagcgctg ccaaatggcc cttatatgtt taaccttatc 31200 
catagtccta gcgagccagc attagagcgt ggcagcgtag agctattttt aaagcataag 31260 
gtacgcaccg ttgaagcatc agctttctta ggtctaacac cacaaatcgt ctattaccgt 31320 
gcagcaggat tgagccgaga cgcacaaggt aaagttgtgg ttggtaacaa ggttatcgct 31380 
aaagtaagtc gcaccgaagt ggctgaaaag tttatgatgc cagcgcccgc aaaaatgcta 31440 
caaaaactag ttgatgacgg ttcaattacc gctgagcaaa tggagctggc gcaacttgta 31500 
cctatggctg acgacatcac tgcagaggcc gattcaggtg gccatactga taaccgtcca 31560 
ttagtaacat tgctgccaac cattttagcg ctgaaagaag aaattcaagc taaataccaa 31620 
tacgacactc ctattcgtgt cggttgtggt ggcggtgtgg gtacgcctga tgcagcgctg 31680 
gcaacgttta acatgggcgc ggcgtatatt gttaccggct ctatcaacca agcttgtgtt 31740 
gaagcgggcg caagtgatca cactcgtaaa ttacttgcca ccactgaaat ggccgatgtg 31800 
actatggcac cagctgcaga tatgttcgag atgggcgtaa aactgcaggt ggttaagcgc 31860 
ggcacgctat tcccaatgcg cgctaacaag ctatatgaga tctacacccg ttacgattca 31920 
atcgaagcga tcccattaga cgagcgtgaa aagcttgaga aacaagtatt ccgctcaagc 31980 
ctagatgaaa tatgggcagg tacagtggcg cactttaacg agcgcgaccc taagcaaatc 32040
gaacgcgcag agggtaaccc taagcgtaaa atggcattga ttttccgttg gtacttaggt 32100 
ctttctagtc gctggtcaaa ctcaggcgaa gtgggtcgtg aaatggatta tcaaatttgg 32160 
gctggccctg ctctcggtgc atttaaccaa tgggcaaaag gcagttactt agataactat 32220 
caagaccgaa atgccgtcga tttggcaaag cacttaatgt acggcgcggc ttacttaaat 32280 
cgtattaact cgctaacggc tcaaggcgtt aaagtgccag cacagttact tcgctggaag 32340 
ccaaaccaaa gaatggccta atacacttac aaagcaccag tctaaaaagc cactaatctt 32400 
gattagtggc tttttttatt gtggtcaata tgaggctatt tagcctgtaa gcctgaaaat 32460 
atcagcactc tgactttaca agcaaattat aattaaggca gggctctact catttatact 32520 
gctagcaaac aagcaagttg cccagtaaaa caacaaggta cctgatttat atcgtcataa 32580 
aagttggcta gagattcgtt attgatcttt actgattaga gtcgctctgt ttggaaaaag 32640 
gtttctcgtt atcatcaaaa tacactctca aacctttaat caattacaac ttaggctttc 32700 
tgcgggcatt tttatcttat ttgccacagc tgtatttgcc tttaggtttt gggtgcaact 32760 
accattaatt gaggcctcat tagttaaatt atctgagcaa gagctcacct ctttaaatta 32820 
cgcttttcag caaatgagaa agccactaca aaccattaat tacgactatg cggtgtggga 32880
cagaacctac agctatatga aatcaaactc agcgagcgct aaaaggtact atgaaaaaca 32940 
tgagtaccca gatgatacgt tcaagagttt aaaagtcgac ggagtattta tattcaaccg 33000
tacaaatcag ccagttttta gtaaaggttt taatcataga aatgatatac cgctggtctt 33060 
tgaattaact gactttaaac aacatccaca aaacatcgca ttatctccac aaaccaaaca 33120 
ggcacaccca ccggcaagta agccgttaga ctcccctgat gatgtgcctt ctacccatgg 33180 
ggttatcgcc acacgatacg gtccagcaat ttatagctct accagcattt taaaatctga 33240 
tcgtagcggc tcccaacttg gttatttagt cttcattagg ttaattgatg aatggttcat 33300 
cgctgagcta tcgcaataca ctgccgcagg tgttgaaatc gctatggctg atgccgcaga 33360 
cgcacaatta gcgagattag gcgcaaacac taagcttaat aaagtaaccg ctacatccga 33420 
acggttaata actaatgtcg atggtaagcc tctgttgaag ttagtgcttt accataccaa 33480 
taaccaaccg ccgccgatgc tagattacag tataataatt ctattagttg agatgtcatt 33540 
tttactgatc ctcgcttatt tcctttactc ctacttctta gtcaggccag ttagaaagct 33600 
ggcttcagat attaaaaaaa tggataaaag tcgtgaaatt aaaaagctaa ggtatcacta 33660 
ccctattact gagctagtca aagttgcgac tcacttcaac gccctaatgg ggacgattca 33720 
ggaacaaact aaacagctta atgaacaagt ttttattgat aaattaacca atattcccaa 33780 
tcgtcgcgct tttgagcagc gacttgaaac ctattgccaa ctgctagcoc ggcaacaaat 33840 
tggctttact ctcatcattg ccgatgtgga tcattttaaa gagtacaacg atactcttgg 33900 
gcaccttgct ggggatgaag cattaataaa agtggcacaa acactatcgc aacagtttta 33960 
ccgtgcagaa gatatttgtg cccgttttgg tggtgaagaa tttattatgt tatttcgaga 34020 
catacctgat gagcccttgc agagaaagct cgatgcgatg ctgcactctt ttgcagagct 34080 
caacctacct catccaaact catcaaccgc taattacgtt actgtgagcc ttggggtttg 34140 
cacagttgtt gctgttgatg attttgaatt taaaagtgag tcgcatatta ttggcagtca 34200 
ggctgcatta atcgcagata aggcgcttta tcatgctaaa gcctgtggtc gtaaccagtt 34260 
gtcaaaaact actattactg ttgatgagat tgagcaatta gaagcaaata aaatcggtca 34320 
tcaagcctaa actcgttcga gtactttccc ctaagtcaga gctatttgcc acttcaagat 34380 
gtggctacaa ggcttactct ttcaaaacct gcatcaatag aacacagcaa aatacaataa 34440 
tttaagtcaa tttagcctat taaacagagt taatgacagc tcatggtcgc aacttattag 34500 
ctatttctag caatataaaa acttatccat tagtagtaac caataaaaaa actaatatat 34560 
aaaactattt aatcattatt ttacagatga ttagctacca cccaccttaa gctggctata 34620 
ttcgcactag taaaaataaa cattagatcg ggttcagatc aatttacgag tctcgtataa 34680 
aatgtacaat aattcactta atttaatact gcatattttt acaagtagag agcggtgatg 34740 
aaacaaaata cgaaaggctt tacattaatt gaattagtca tcgtgattat tattctoggt 34800 
atacttgctg ctgtggcact gccgaaattc atcaatgttc aagatgaogc taggatctct 34860 
gcgatgagcg gtcagttttc atcatttgaa agtgccgtaa aactatacca tagcggttgg 34920
ttagccaaag gctacaacac tgcggttgaa aagctctcag gctttggcca aggtaatgtt 34980 
gcatcaagtg acacaggttt tccgtactca acatcaggca cgagtactga tgtgcataaa 35040 
gcttgtggtg aactatggca tggcattacc gatacagact tcacaattgg tgcggttagt 35100 
gatggcgatc taatgactgc agatgtcgat attgcttaca cctatcgtgg tgatatgtgt 35160 
atctatcgcg atctgtattt tattcagcgc tcattaccta ctaaggtgat gaactacaaa 35220 
tttaaaactg gtgaaataga aattattgat gctttctaca accctgacgg ctcaactggt 35280 
caattaccat aaatttggcg cttatctaag ttgtacttgc tctgaccgac acaaataatg 35340 
tcgtttctca gcatatatca aaatacacag caaaaatttg gggttagcta tatagctaac 35400 
cccaaatcat atctaacttt acactgcatc taattccaaa cagtatccag ccaaaagcct 35460 
aaactattgt tgactcagcg ctaaaatatg cgatgcaaca aacaagtctt ggatcgcaat 35520 
acctgagcta tcaaaaatgg tcacctcatc agcactttga cgtcctgttg cggactcgtt 35580 
tatcacctga ccaatctcaa ttatcggcgt atttctgcta tgttgaaact caccaataac 35640 
aatagattga gaagcaaagt cgcaaaacaa gcgagcatga ctatataggt cagttggcaa 35700 
ctcttgctta cccactttat cagcgcccat tgcagaaata tgcgttcctg cttgtaccca 35760
ctgcgcttca aataaaggcg cttgagctgt ggttgctgtg ataataatat ctgcttgttc 35820
acaagcagct tgtgcatcac aagcttcggd attaatgcct ttttctaata aacgcttaac 35880 
caagttttca gttttgctag cactacggcc aactaccaat accttagtta atgaacgaac 35940
cttgctcact gctagcactt catattcagc ctgatgaccg gtaccaaaaa cagttaatac 36000 
cgtagcatct tctctcgcga ggtaactcac tgctactgca tcggcagcac cagtgcggta 36060 
agcattaacg gtagtggcag caatcaccgn ctgcaacata ccggttaatg gatcgagtaa 36120
a!atacgtta gtgccgtggc atggtaaacc a.gtttatgg ttatcaggcc aa agctgc^c 36180
tgttttccag ccgacaaggt ttggcgttga agccgacttt aatgagaaca tttcattaag 36240
gttcgcgccc tgtgcattaa ctaccgggaa caaggttgct ttatcatcta cggcagcgac 36300
aaacgc!tct ttaacagcga tataagccag ctcatgggag atgagctttg atgtttgcgc 36360
ttcagttaaa tagatcatat taccacccct gcactcgatt ccagatctca tagccaccat 36420
tatcaccatc agtatcaaat acatggtact gagcgtgcat tgaagctgtt gcacaggcgt 36480
ggttcggcaa aatatgtaga cgactaccta ccgggaactg cgctaaatca ataacgccgc 36540 
catca!!tgc ttcaataatg ccgtgctctt gattaacagt tataacctgt agacctgata 36600
acacgtgacc gctgtcgtca cacactaaac cataaccaca atcttttggc tgctctgcag 36660 
taccLtatc acccgaaaga gccatccaac ccgcatcaat gaaaatccag tttttatcag 36720
gattatgacc aataacactg gtcactaccg ttgcggcaat atcagttaac tgacacacgt 36780 
ttagccctgc catgactaaa tcgaagaagg tgtacacacc cgctctaacc tcggtgatcc 36840 
catcaaggtt ttgatagctt tgcgctgttg gtgttgaacc aatactaacg atgtcacatt 36900 
gcatacccgc tgcgcgaatg cgtcagcagc ttgtacagcc gctgcaactt cattttgcgc 36960 
cg£atcaatt aattgctgtt tttcaaaaca ttgatatgac tcaccagcgt gagtnagtac 37020 
gccgtgaaaa ctcgctgcgc cagacgttag tatctgagca atttcaatca acttatcggc 37080
ttccggtgga ataccaccac gatggccatc acaatcaatt tcaattaatg ctggtatttg 37140
gcagtcataa gaaccacaga aatgatttag ctgatgcgct tgctcaacac tatcaagtaa 37200 
aactcttgca ttaatacctt ggtccaacat tttagcaata cgcggcaact taccatcggc 37260
aatacctact gcataaataa tgtctgtgta acctttagat gctaaggcct cggcctcttt 37320 
taccgttgat acagtgactg gtgagttttt agtgggtaat aaaaactcgg ctgcttcaag 37380
tgatcttaac gttttaaaat gcggtcttag gtttgcacct aatccttcaa ttttttggcg 37440
tagttgactg aggttattaa taaatactgg cttatttaca tataaaaacg gtgtatcaat 37500 
tgcttgatac tgactttgct gagtcgtgga aagtatttga gtagatggca tctttaatat 37560 
cctagttcat caatcaatct aacaagtttg atgcctagcc acagtggctt gtattcatga 37620
tgctttggaa aatgcttata ttcaaagtat ttgaaagaca tcaaacttct tgtttaatgc 37680 
tcagtatcca ccagcacgca tttattttat attaactatt atcaagatat agattaggtt 37740 
caaaccaaat gattagtact gaagatctac gttttatcag cgtaatcgcc agtcatcgca 37800
ccttagctga tgccgctaga acactaaata tcacgccaoc atcagtgaca ttaaggttgc 37860 
agcatattga aaagaaacta tcgattagcc tgatc 

<210> 2 
<211> 654 
<212> PRT 

<213> Shewanella putrefaciens 



<400> 2 

Met Lys Gln Thr Leu Met Ala Ile Ser Ile Met Ser Leu Phe Ser Phe
1 5 10 15

Asn Ala Leu Ala Ala Gln His Glu His Asp His Ile Thr Val Asp Tyr
20 25 30

Glu Gly Lys Ala Ala Thr Glu His Thr Ile Ala His Asn Gln Ala Val
35 40 45

Ala Lys Thr Leu Asn Phe Ala Asp Thr Arg Ala Phe Glu Gln Ser Ser
50 55 60

Lys Asn Leu Val Ala Lys Phe Asp Lys Ala Thr Ala Asp Ile Leu Arg
65 70 75 80

Ala Glu Phe Ala Phe Ile Ser Asp Glu Ile Pro Asp Ser Val Asn Pro
85 90 95

Ser Leu Tyr Arg Gln Ala Gln Leu Asn Met Val Pro Asn Gly Tyr Lys
100 105 110

Val Ser Asp Gly Ile Tyr Gln Val Arg Gly Thr Asp Leu Ser Asn Leu
115 120 125

Thr Leu Ile Arg Ser Asp Asn Gly Trp Ile Ala Tyr Asp Val Leu Leu
130 135 140

Thr Lys Glu Ala Ala Lys Ala Ser Leu Gln Phe Ala Leu Lys Asn Leu
145 150 155 160

Pro Lys Asp Gly Asp Pro Val Val Ala Met Ile Tyr Ser His Ser His
165 170 175

Ala Asp His Phe Gly Gly Ala Arg Gly Val Gln Glu Met Phe Pro Asp
180 185 190

Val Lys Val Tyr Gly Ser Asp Asn Ile Thr Lys Glu Ile Val Asp Glu
195 200 205

Asn Val Leu Ala Gly Asn Ala Met Ser Arg Arg Ala Ala Tyr Gln Tyr
210 215 220

Gly Ala Thr Leu Gly Lys His Asp His Gly Ile Val Asp Ala Ala Leu
225 230 235 240

Gly Lys Gly Leu Ser Lys Gly Glu Ile Thr Tyr Val Ala Pro Asp Tyr
245 250 255

Thr Leu Asn Ser Glu Gly Lys Trp Glu Thr Leu Thr Ile Asp Gly Leu
260 265 270

Glu Met Val Phe Met Asp Ala Ser Gly Thr Glu Ala Glu Ser Glu Met
275 280 285

Ile Thr Tyr Ile Pro Ser Lys Lys Ala Leu Trp Thr Ala Glu Leu Thr
290 295 300



Tyr Gln Gly Met His Asn Ile Tyr Thr Leu Arg Gly Ala Lys Val Arg
305 310 315 320

Asp Ala Leu Lys Trp Ser Lys Asp Ile Asn Glu Met Ile Asn Ala Phe
325 330 335

Gly Gln Asp Val Glu Val Leu Phe Ala Ser His Ser Ala Pro Val Trp
340 345 350

Gly Asn Gln Ala Ile Asn Asp Phe Leu Arg Leu Gln Arg Asp Asn Tyr
355 360 365

Gly Leu Val His Asn Gln Thr Leu Arg Leu Ala Asn Asp Gly Val Gly
370 375 380

Ile Gln Asp Ile Gly Asp Ala Ile Gln Asp Thr Ile Pro Glu Ser Ile
385 390 395 400

Tyr Lys Thr Trp His Thr Asn Gly Tyr His Gly Thr Tyr Ser His Asn 
405 410 415 

Ala Lys Ala Val Tyr Asn Lys Tyr Leu Gly Tyr Phe Asp Met Asn Pro 
420 425 430 

Ala Asn Leu Asn Pro Leu Pro Thr Lys Gln Glu Ser Ala Lys Phe Val
435 440 445

Glu Tyr Met Gly Gly Ala Asp Ala Ala Ile Lys Arg Ala Lys Asp Asp
450 455 460

Tyr Ala Gln Gly Glu Tyr Arg Phe Val Ala Thr Ala Leu Asn Lys Val
465 470 475 480

Val Met Ala Glu Pro Glu Asn Asp Ser Ala Arg Gln Leu Leu Ala Asp
485 490 495

Thr Tyr Glu Gln Leu Gly Tyr Gln Ala Glu Gly Ala Gly Trp Arg Asn
500 505 510

Ile Tyr Leu Thr Gly Ala Gln Glu Leu Arg Val Gly Ile Gln Ala Gly
515 520 525

Ala Pro Lys Thr Ala Ser Ala Asp Val Ile Ser Glu Met Asp Met Pro
530 535 540

Thr Leu Phe Asp Phe Leu Ala Val Lys Ile Asp Ser Gln Gln Ala Ala
545 550 555 560

Lys His Gly Leu Val Lys Met Asn Val Ile Thr Pro Asp Thr Lys Asp
565 570 575

Ile Leu Tyr Ile Glu Leu Ser Asn Gly Asn Leu Ser Asn Ala Val Val
580 585 590

Asp Lys Glu Gln Leu Met Val Asn Lys Ala Asp Val Asn Arg Ile Leu
595 600 605

Leu Gly Gln Val Thr Leu Lys Ala Leu Leu Ala Ser Gly Asp Ala Lys
610 615 620

Leu Thr Gly Asp Lys Thr Ala Phe Ser Lys Ile Ala Asp Ser Met Val
625 630 635 640

Glu Phe Thr Pro Asp Phe Glu Ile Val Pro Thr Pro Val Lys
645 650



<210> 3 
<211> 277
<212> PRT 

<213> Shewanella putrefaciens 
<400> 3 

Ser Thr Lys Ala Ser Ala Arg Val Val Ala Lys Phe Asn Val Glu Glu 
1 5 10 15 

Ala Ala Ile Ser Ile Gln Gln Cys Gln Gly Ile Ser Leu Ala Phe Arg
20 25 30

Tyr Ser Asp Asp Leu His Gly Leu Leu Cys His Trp Asn Asp Ala Ala
35 40 45

Asn Met Gln Gln Glu Lys Ala Glu Ile Leu Gly Leu Gly Ser Lys Gln
50 55 60

Pro Glu Ala Asn Pro Lys Asn Ser Ser Ser Glu Leu Leu Ala Leu Gly
65 70 75 80

Ile Asp Gln Lys Leu Leu Val Gln Arg Gln Asn Leu Gln His Glu Val
85 90 95

Lys His Asp Ala Ile Ala Asp Ser Ile Asp Val Cys His Ser Leu Ser
100 105 110

Lys Pro Ala Asn Val Gly Leu Phe Thr Glu Ser Leu Ala Ser Phe Asp
115 120 125

Phe Ala Phe Ser Lys Leu Ser Leu Ala Leu Gly Leu Gly Lys Ala Lys
130 135 140

Ile Tyr Ser Glu Lys Leu Ala Trp Leu Asp Phe Phe Arg Asp Arg Gln
145 150 155 160

Leu Ala Glu Pro Leu Ala Leu Leu Ala Arg Lys Glu Ser Glu Ser Phe
165 170 175

Tyr His Ser Leu Ile Ser His Ile Asn Thr Ser Asn Arg Cys Arg Glu
180 185 190

Ile Asp Val Gly Phe Glu Ile Ser Ala Ser Asp Thr Glu Glu Lys Ser
195 200 205

Ala Gln Ser Ala Gly Lys Asn Asp Ala Thr Cys Ile Gly Val Leu Leu
210 215 220

Trp Asp Gly Ser His Ser Val Asn Phe His Val Gly Thr Gln Ala Phe
225 230 235 240

Gln Ala Asp Ser Leu Arg Pro Lys Gly Lys Asp Gly Tyr Glu Phe Arg
245 250 255

Trp Glu Asn Pro Arg Ile Glu Ser His Gln Ser Leu Leu Ala Arg Leu
260 265 270

Tyr Gly Arg Val Met 
275 



<210> 4 
<211> 1480 
<212> DNA 

<213> Shewanella putrefaciens 
<400> 4 

gctagtctta gctgasrthr ysaasragct cgaacaacag ctttaaaatt cacttcttct 60
gctgcaatac ttatttgctg acactgacca atactcagtg caaaacgata actatcatca 120
agatggaaar gvavaaaysh asnvaggaaa asrgngncys gngysraaha rgtyrsrasa 180
shscccagta aacaatgcca attatcagca gcgttcattt gctgttcttt agcctcaatc 240
aaacctaaac cagacttttg tggctcagcg ttaggcttat taggycyshs trasnasaaa 300
aasnmtgngn gysaaggygy srysgnrgaa asnrysasns raactcgact ctagcaaagc 360
aagaccaata tcttgtttta acaaaacctg tcgctgatta agttgatgct caaccttgtg 420
atccgcaata gcatcggaaa tsrsrgaagy asgnysvagn arggnasngn hsgvayshsa 480
saaaaassra tcaacacaat ggctcaagct tttaggtgca ttaactccaa gaaaagtttc 540
gctcagtgca gagaagtcaa acgcaaaaga ttttagcgat aatgccagca svacyshssr 600
srysraaasn vagyhthrgs raasrhasha ahsryssraa ccaagtcctt tcgctttaat 660
gtaagactcc ttgagcgccc acaaatcaaa aaagcggtct cgctgcaagg cctctggtaa 720
cgctaacaag gctcgctttt gygyysaays tyrsrgysaa trashharga sarggnaagr 780
aaaaargysg ctgattcaga gaaa^^aatga ctaagaatag agtggatatt ggtgctgtta 840
cggcaacgct caatgtcgac gccaaactca atactagcag agtcagtttc srgsrhtyrh 900
ssrsrhsasn thrsrasnar gcysarggas vagyhgsraa srasthrgct ccttgcttgc 960
ctgactggcg cctttattat cagcagtgca aatgcctact aatagccaat ctccactatg 1020
actcacatta aagtggaccc cggtttgagy ssraagnsra agyysasnas aathrcysgy 1080
vatrasgysr hssrvaasnh hsvagythrg ngcaaattgc gcatcactca atctaggctt 1140
acctttgtcg ccatattcaa agcgccatrc attggggcgt atttcactat gttgtgacaa 1200
taaagcgcgc aaahgnaaas srargrysgy ysasgytyrg hargtrgasn rarggsrhsg 1260
nsraaargaa tagcctctta ccattaaacc ttgagtttta gcttcttgtt taatgtagcg 1320
attaacctta attaactcat cttcaggcag ccatgactta accaactcty rgyargvamt 1380
gygnthrysa aggnystyra rgasnvaysg asgrtrsrys vagtgtagtc tggttatcgc 1440
actcttgtat tgttaacgga cagaagtata aggaaatcaa 1480



<210> 5
<211> 970 
<212> PRT 

<213> Shewanella putrefaciens 



<400> 5 

Met Ser Met Phe Leu Asn Ser Lys Leu Ser Arg Ser Val Lys Leu Ala
1 5 10 15

Ile Ser Ala Gly Leu Thr Ala Ser Leu Ala Met Pro Val Phe Ala Glu
20 25 30

Glu Thr Ala Ala Glu Glu Gln Ile Glu Arg Val Ala Val Thr Gly Ser
35 40 45

Arg Ile Ala Lys Ala Glu Leu Thr Gln Pro Ala Pro Val Val Ser Leu
50 55 60

Ser Ala Glu Glu Leu Thr Lys Phe Gly Asn Gln Asp Leu Gly Ser Val
65 70 75 80

Leu Ala Glu Leu Pro Ala Ile Gly Ala Thr Asn Thr Ile Ile Gly Asn
85 90 95

Asn Asn Ser Asn Ser Ser Ala Gly Val Ser Ser Ala Asp Leu Arg Arg
100 105 110

Leu Gly Ala Asn Arg Thr Leu Val Leu Val Asn Gly Lys Arg Tyr Val
115 120 125

Ala Gly Gln Pro Gly Ser Ala Glu Val Asp Leu Ser Thr Ile Pro Thr
130 135 140

Ser Met Ile Ser Arg Val Glu Ile Val Thr Gly Gly Ala Ser Ala Ile
145 150 155 160



Tyr Gly Ser Asp Ala Val Ser Gly Val Ile Asn Val Ile Leu Lys Glu
165 170 175

Asp Phe Glu Gly Phe Glu Phe Asn Ala Arg Thr Ser Gly Ser Thr Glu
180 185 190

Ser Val Gly Thr Gln Glu His Ser Phe Asp Ile Leu Gly Gly Ala Asn
195 200 205

Val Ala Asp Gly Arg Gly Asn Val Thr Phe Tyr Ala Gly Tyr Glu Arg
210 215 220

Thr Lys Glu Val Met Ala Thr Asp Ile Arg Gln Phe Asp Ala Trp Gly
225 230 235 240

Thr Ile Lys Asn Glu Ala Asp Gly Gly Glu Asp Asp Gly Ile Pro Asp
245 250 255

Arg Leu Arg Val Pro Arg Val Tyr Ser Glu Met Ile Asn Ala Thr Gly
260 265 270

Val Ile Asn Ala Phe Gly Gly Gly Ile Gly Arg Ser Thr Phe Asp Ser
275 280 285

Asn Gly Asn Pro Ile Ala Gln Gln Glu Arg Asp Gly Thr Asn Ser Phe
290 295 300

Ala Phe Gly Ser Phe Pro Asn Gly Cys Asp Thr Cys Phe Asn Thr Glu
305 310 315 320

Ala Tyr Glu Asn Tyr Ile Pro Gly Val Glu Arg Ile Asn Val Gly Ser
325 330 335

Ser Phe Asn Phe Asp Phe Thr Asp Asn Ile Gln Phe Tyr Thr Asp Phe
340 345 350

Arg Tyr Val Lys Ser Asp Ile Gln Gln Gln Phe Gln Pro Ser Phe Arg
355 360 365
. -1^ Asn Val Glu Asp Asn Ala Phe Leu Asn Asp

Phe Gly Asn lie Asn ile Asn Vai y 

370 



^, ^1 M»r Leu ASP Ala Gly Gin Tht Asn Ala Ser Phe 
Asp Leu Arg Gin Gin Met Leu Asp 
385 390 395 

Ma Lys Phe Phe Asp Glu Leu Gly Asn Ar. Ser Ala Glu Asn Lys Arg 

405 

Glu Leu Phe Arg Tyr Val Gly Gly Phe Lys Gly Gly Phe As, He Ser 

420 

Olu Thr lie Phe Asp Tyr Asp Leu Tyr Tyr Val Tyr Gly Glu Thr Asn 
435 

Asn Arg Arg Lys Thr Leu Asn Asp Leu He Pro Asp Asn Phe Val Ala 



450 



455 



460 



Ala val ASP ser Val Ue Asp Pro Asp Thr Gly Leu Ala Ala Cys Arg 



470 



475 



465 

Ser Gln Val Ala Ser Ala Gln Gly Asp Asp Tyr Thr Asp Pro Ala Ser
485 490 495

Val Asn Gly Ser Asp Cys Val Ala Tyr Asn Pro Phe Gly Met Gly Gln
500 505 510

Ala Ser Ala Glu Ala Arg Asp Trp Val Ser Ala Asp Val Thr Arg Glu
515 520 525

Asp Lys Ile Thr Gln Gln Val Ile Gly Gly Thr Leu Gly Thr Asp Ser
530 535 540

Glu Glu Leu Phe Glu Leu Gln Gly Gly Ala Ile Ala Met Val Val Gly
545 550 555 560



Phe Glu Tyr Arg 



r Arg Glu Glu Thr Ser Gly Ser Thr Thr Asp Glu Phe Thr 



565 570 575



Lys Ala Gly Phe Leu Thr Ser Ala Ala Thr Pro Asp Ser Tyr Gly Glu 
580 585 590 

Tyr Asp Val Thr Glu Tyr Phe Val Glu Val Asn Ile Pro Val Leu Lys
595 600 605



Glu Leu Pro Phe Ala His Glu Leu Ser Phe Asp Gly Ala Tyr Arg Asn
610 615 620
Ala Asp Tyr Ser His Ala Gly Lys Thr Glu Ala Trp Lys Ala Gly Met
625 630 635 640

P.e .yr Se. P.o .eu Glu Gin Leu Ala .eu A., Gly X.. Val Gly Clu 

645 

Ala Val Arg Ala Pro Asn Ile Ala Glu Ala Phe Ser Pro Arg Ser Pro
660 665 670

Gly Phe Gly Arg Val Ser Asp Pro Cys Asp Ala Asp Asn Ile Asn Asp
675 680 685

Asp Pro Asp Arg Val Ser Asn Cys Ala Ala Leu Gly Ile Pro Pro Gly
690 695 700

Phe Gln Ala Asn Asp Asn Val Ser Val Asp Thr Leu Ser Gly Gly Asn
705 710 715 720

Pro Asp Leu Lys Pro Glu Thr Ser Thr Ser Phe Thr Gly Gly Leu Val
725 730 735

Trp Thr Pro Thr Phe Ala Asp Asn Leu Ser Phe Thr Val Asp Tyr Tyr
740 745 750

Asp Ile Gln Ile Glu Asp Ala Ile Leu Ser Val Ala Thr Gln Thr Val
755 760 765

Ala Asp Asn Cys Val Asp Ser Thr Gly Gly Pro Asp Thr Asp Phe Cys
770 775 780

Ser Gln Val Asp Arg Asn Pro Thr Thr Tyr Asp Ile Glu Leu Val Arg
785 790 795 800

Ser Gly Tyr Leu Asn Ala Ala Ala Leu Asn Thr Lys Gly Ile Glu Phe
805 810 815

Gln Ala Ala Tyr Ser Leu Asp Leu Glu Ser Phe Asn Ala Pro Gly Glu
820 825 830

Leu Arg Phe Asn Leu Leu Gly Asn Gln Leu Leu Glu Leu Glu Arg Leu
835 840 845

Glu Phe Gln Asn Arg Pro Asp Glu Ile Asn Asp Glu Lys Gly Glu Val
850 855 860

Gly Asp Pro Glu Leu Gln Phe Arg Leu Gly Ile Asp Tyr Arg Leu Asp
865 870 875 880

Asp Leu Ser Val Ser Trp Asn Thr Arg Tyr Ile Asp Ser Val Val Thr
885 890 895

Tyr Asp Val Ser Glu Asn Gly Gly Ser Pro Glu Asp Leu Tyr Pro Gly
900 905 910

His Ile Gly Ser Met Thr Thr His Asp Leu Ser Ala Thr Tyr Tyr Ile
915 920 925

Asn Glu Asn Phe Met Ile Asn Gly Gly Val Arg Asn Leu Phe Asp Ala
930 935 940

Leu Pro Pro Gly Tyr Thr Asn Asp Ala Leu Tyr Asp Leu Val Gly Arg
945 950 955 960

Arg Ala Phe Leu Gly Ile Lys Val Met Met
965 970



<210> 6 
<211> 288 
<212> PRT 

<213> Shewanella putrefaciens 



llTlll Lys Ile Asn Ser Glu His Leu Asp Glu Ala Thr Ile Thr Ser
1 5 10 15



Asn Lys Cys Thr Gln Thr Glu Thr Glu Ala Arg His Arg Asn Ala Thr
20 25 30

Thr Thr Pro Glu Met Arg Arg Phe Ile Gln Glu Ser Asp Leu Ser Val
35 40 45

Ser Gln Leu Ser Lys Ile Leu Asn Ile Ser Glu Ala Thr Val Arg Lys
50 55 60

Trp Arg Lys Arg Asp Ser Val Glu Asn Cys Pro Asn Thr Pro His His
65 70 75 80

Leu Asn Thr Thr Leu Thr Pro Leu Gln Glu Tyr Val Val Val Gly Leu
85 90 95

Arg Tyr Gln Leu Lys Met Pro Leu Asp Arg Leu Leu Lys Ala Thr Gln
100 105 110

Glu Phe Ile Asn Pro Asn Val Ser Arg Ser Gly Leu Ala Arg Cys Leu
115 120 125

Lys Arg Tyr Gly Val Ser Arg Val Ser Asp Ile Gln Ser Pro His Val
130 135 140

Pro Met Arg Tyr Phe Asn Gln Ile Pro Val Thr Gln Gly Ser Asp Val
145 150 155 160

Gln Thr Tyr Thr Leu His Tyr Glu Thr Leu Ala Lys Thr Leu Ala Leu
165 170 175

Pro Ser Thr Asp Gly Asp Asn Val Val Gln Val Val Ser Leu Thr Ile
180 185 190

Pro Pro Lys Leu Thr Glu Glu Ala Pro Ser Ser Ile Leu Leu Gly Ile
195 200 205

Asp Pro His Ser Asp Trp Ile Tyr Leu Asp Ile Tyr Gln Asp Gly Asn
210 215 220

Thr Gln Ala Thr Asn Arg Tyr Met Ala Tyr Val Leu Lys His Gly Pro
225 230 235 240

Phe His Leu Arg Lys Leu Leu Val Arg Asn Tyr His Thr Phe Leu Gln
245 250 255

Arg Phe Pro Gly Ala Thr Gln Asn Arg Arg Pro Ser Lys Asp Met Pro
260 265 270

Glu Thr Ile Asn Lys Thr Pro Glu Thr Gln Ala Pro Ser Gly Asp Ser
275 280 285



<210> 7 
<211> 2756 
<212> PRT 

<213> Shewanella putrefaciens 
<400> 7 

Met Ser Gln Thr Ser Lys Pro Thr Asn Ser Ala Thr Glu Gln Ala Gln
1 5 10 15

Asp Ser Gln Ala Asp Ser Arg Leu Asn Lys Arg Leu Lys Asp Met Pro
20 25 30

Ile Ala Ile Val Gly Met Ala Ser Ile Phe Ala Asn Ser Arg Tyr Leu
35 40 45



Asn Lys Phe Trp Asp Leu Ile Ser Glu Lys Ile Asp Ala Ile Thr Glu
50 55 60

Leu Pro Ser Thr His Trp Gln Pro Glu Glu Tyr Tyr Asp Ala Asp Lys
65 70 75 80

Thr Ala Ala Asp Lys Ser Tyr Cys Lys Arg Gly Gly Phe Leu Pro Asp
85 90 95

Val Asp Phe Asn Pro Met Glu Phe Gly Leu Pro Pro Asn Ile Leu Glu
100 105 110

Leu Thr Asp Ser Ser Gln Leu Leu Ser Leu Ile Val Ala Lys Glu Val
115 120 125

Leu Ala Asp Ala Asn Leu Pro Glu Asn Tyr Asp Arg Asp Lys Ile Gly
130 135 140

Ile Thr Leu Gly Val Gly Gly Gly Gln Lys Ile Ser His Ser Leu Thr
145 150 155 160

Ala Arg Leu Gln Tyr Pro Val Leu Lys Lys Val Phe Ala Asn Ser Gly
165 170 175

Ile Ser Asp Thr Asp Ser Glu Met Leu Ile Lys Lys Phe Gln Asp Gln
180 185 190

Tyr Val His Trp Glu Glu Asn Ser Phe Pro Gly Ser Leu Gly Asn Val
195 200 205

Ile Ala Gly Arg Ile Ala Asn Arg Phe Asp Phe Gly Gly Met Asn Cys
210 215 220

Val Val Asp Ala Ala Cys Ala Gly Ser Leu Ala Ala Met Arg Met Ala
225 230 235 240

Leu Thr Glu Leu Thr Glu Gly Arg Ser Glu Met Met Ile Thr Gly Gly
245 250 255

Val Cys Thr Asp Asn Ser Pro Ser Met Tyr Met Ser Phe Ser Lys Thr
260 265 270

Pro Ala Phe Thr Thr Asn Glu Thr Ile Gln Pro Phe Asp Ile Asp Ser
275 280 285

Lys Gly Met Met Ile Gly Glu Gly Ile Gly Met Val Ala Leu Lys Arg
290 295 300



Leu Glu Asp Ala Glu Arg Asp Gly Asp Arg Ile Tyr Ser Val Ile Lys
305 310 315 320

Gly Val Gly Ala Ser Ser Asp Gly Lys Phe Lys Ser Ile Tyr Ala Pro
325 330 335

Arg Pro Ser Gly Gln Ala Lys Ala Leu Asn Arg Ala Tyr Asp Asp Ala
340 345 350

Gly Phe Ala Pro His Thr Leu Gly Leu Ile Glu Ala His Gly Thr Gly
355 360 365

Thr Ala Ala Gly Asp Ala Ala Glu Phe Ala Gly Leu Cys Ser Val Phe
370 375 380

Ala Glu Gly Asn Asp Thr Lys Gln His Ile Ala Leu Gly Ser Val Lys
385 390 395 400

Ser Gln Ile Gly His Thr Lys Ser Thr Ala Gly Thr Ala Gly Leu Ile
405 410 415

Lys Ala Ala Leu Ala Leu His His Lys Val Leu Pro Pro Thr Ile Asn
420 425 430

Val Ser Gln Pro Ser Pro Lys Leu Asp Ile Glu Asn Ser Pro Phe Tyr
435 440 445

Leu Asn Thr Glu Thr Arg Pro Trp Leu Pro Arg Val Asp Gly Thr Pro
450 455 460

Arg Arg Ala Gly Ile Ser Ser Phe Gly Phe Gly Gly Thr Asn Phe His
465 470 475 480

Phe Val Leu Glu Glu Tyr Asn Gln Glu His Ser Arg Thr Asp Ser Glu
485 490 495

Lys Ala Lys Tyr Arg Gln Arg Gln Val Ala Gln Ser Phe Leu Val Ser
500 505 510

Ala Ser Asp Lys Ala Ser Leu Ile Asn Glu Leu Asn Val Leu Ala Ala
515 520 525

Ser Ala Ser Gln Ala Glu Phe Ile Leu Lys Asp Ala Ala Ala Asn Tyr
530 535 540

Gly Val Arg Glu Leu Asp Lys Asn Ala Pro Arg Ile Gly Leu Val Ala
545 550 555 560



Asn Thr Ala Glu Glu Leu Ala Gly Leu Ile Lys Gln Ala Leu Ala Lys
565 570 575

Leu Ala Ala Ser Asp Asp Asn Ala Trp Gln Leu Pro Gly Gly Thr Ser
580 585 590

Tyr Arg Ala Ala Ala Val Glu Gly Lys Val Ala Ala Leu Phe Ala Gly
595 600 605

Gln Gly Ser Gln Tyr Leu Asn Met Gly Arg Asp Leu Thr Cys Tyr Tyr
610 615 620

Pro Glu Met Arg Gln Gln Phe Val Thr Ala Asp Lys Val Phe Ala Ala
625 630 635 640

Asn Asp Lys Thr Pro Leu Ser Gln Thr Leu Tyr Pro Lys Pro Val Phe
645 650 655

Asn Lys Asp Glu Leu Lys Ala Gln Glu Ala Ile Leu Thr Asn Thr Ala
660 665 670

Asn Ala Gln Ser Ala Ile Gly Ala Ile Ser Met Gly Gln Tyr Asp Leu
675 680 685

Phe Thr Ala Ala Gly Phe Asn Ala Asp Met Val Ala Gly His Ser Phe
690 695 700

Gly Glu Leu Ser Ala Leu Cys Ala Ala Gly Val Ile Ser Ala Asp Asp
705 710 715 720

Tyr Tyr Lys Leu Ala Phe Ala Arg Gly Glu Ala Met Ala Thr Lys Ala
725 730 735

Pro Ala Lys Asp Gly Val Glu Ala Asp Ala Gly Ala Met Phe Ala Ile
740 745 750

Ile Thr Lys Ser Ala Ala Asp Leu Glu Thr Val Glu Ala Thr Ile Ala
755 760 765

Lys Phe Asp Gly Val Lys Val Ala Asn Tyr Asn Ala Pro Thr Gln Ser
770 775 780

Val Ile Ala Gly Pro Thr Ala Thr Thr Ala Asp Ala Ala Lys Ala Leu
785 790 795 800

Thr Glu Leu Gly Tyr Lys Ala Ile Asn Leu Pro Val Ser Gly Ala Phe
805 810 815



His Thr Glu Leu Val Gly His Ala Gln Ala Pro Phe Ala Lys Ala Ile
820 825 830

Asp Ala Ala Lys Phe Thr Lys Thr Ser Arg Ala Leu Tyr Ser Asn Ala
835 840 845

Thr Gly Gly Leu Tyr Glu Ser Thr Ala Ala Lys Ile Lys Ala Ser Phe
850 855 860

Lys Lys His Met Leu Gln Ser Val Arg Phe Thr Ser Gln Leu Glu Ala
865 870 875 880

Met Tyr Asn Asp Gly Ala Arg Val Phe Val Glu Phe Gly Pro Lys Asn
885 890 895

Ile Leu Gln Lys Leu Val Gln Gly Thr Leu Val Asn Thr Glu Asn Glu
900 905 910

Val Cys Thr Ile Ser Ile Asn Pro Asn Pro Lys Val Asp Ser Asp Leu
915 920 925

Gln Leu Lys Gln Ala Ala Met Gln Leu Ala Val Thr Gly Val Val Leu
930 935 940

Ser Glu Ile Asp Pro Tyr Gln Ala Asp Ile Ala Ala Pro Ala Lys Lys
945 950 955 960

Ser Pro Met Ser Ile Ser Leu Asn Ala Ala Asn His Ile Ser Lys Ala
965 970 975

Thr Arg Ala Lys Met Ala Lys Ser Leu Glu Thr Gly Ile Val Thr Ser
980 985 990

Gln Ile Glu His Val Ile Glu Glu Lys Ile Val Glu Val Glu Lys Leu
995 1000 1005

Val Glu Val Glu Lys Ile Val Glu Lys Val Val Glu Val Glu Lys Val
1010 1015 1020

Val Glu Val Glu Ala Pro Val Asn Ser Val Gln Ala Asn Ala Ile Gln
1025 1030 1035 1040

Thr Arg Ser Val Val Ala Pro Val Ile Glu Asn Gln Val Val Ser Lys
1045 1050 1055

Asn Ser Lys Pro Ala Val Gln Ser Ile Ser Gly Asp Ala Leu Ser Asn
1060 1065 1070



Phe Phe Ala Ala Gln Gln Gln Thr Ala Gln Leu His Gln Gln Phe Leu
1075 1080 1085

Ala Ile Pro Gln Gln Tyr Gly Glu Thr Phe Thr Thr Leu Met Thr Glu
1090 1095 1100

Gln Ala Lys Leu Ala Ser Ser Gly Val Ala Ile Pro Glu Ser Leu Gln
1105 1110 1115 1120

Arg Ser Met Glu Gln Phe His Gln Leu Gln Ala Gln Thr Leu Gln Ser
1125 1130 1135

His Thr Gln Phe Leu Glu Met Gln Ala Gly Ser Asn Ile Ala Ala Leu
1140 1145 1150

Asn Leu Leu Asn Ser Ser Gln Ala Thr Tyr Ala Pro Ala Ile His Asn
1155 1160 1165

Glu Ala Ile Gln Ser Gln Val Val Gln Ser Gln Thr Ala Val Gln Pro
1170 1175 1180

Val Ile Ser Thr Gln Val Asn His Val Ser Glu Gln Pro Thr Gln Ala
1185 1190 1195 1200

Pro Ala Pro Lys Ala Gln Pro Ala Pro Val Thr Thr Ala Val Gln Thr
1205 1210 1215

Ala Pro Ala Gln Val Val Arg Gln Ala Ala Pro Val Gln Ala Ala Ile
1220 1225 1230

Glu Pro Ile Asn Thr Ser Val Ala Thr Thr Thr Pro Ser Ala Phe Ser
1235 1240 1245

Ala Glu Thr Ala Leu Ser Ala Thr Lys Val Gln Ala Thr Met Leu Glu
1250 1255 1260

Val Val Ala Glu Lys Thr Gly Tyr Pro Thr Glu Met Leu Glu Leu Glu
1265 1270 1275 1280

Met Asp Met Glu Ala Asp Leu Gly Ile Asp Ser Ile Lys Arg Val Glu
1285 1290 1295

Ile Leu Gly Thr Val Gln Asp Glu Leu Pro Gly Leu Pro Glu Leu Ser
1300 1305 1310

Pro Glu Asp Leu Ala Glu Cys Arg Thr Leu Gly Glu Ile Val Asp Tyr
1315 1320 1325



Met Gly Ser Lys Leu Pro Ala Glu Gly Ser Met Asn Ser Gln Leu Ser
1330 1335 1340

Thr Gly Ser Ala Ala Ala Thr Pro Ala Ala Asn Gly Leu Ser Ala Glu
1345 1350 1355 1360

Lys Val Gln Ala Thr Met Met Ser Val Val Ala Glu Lys Thr Gly Tyr
1365 1370 1375

Pro Thr Glu Met Leu Glu Leu Glu Met Asp Met Glu Ala Asp Leu Gly
1380 1385 1390

Ile Asp Ser Ile Lys Arg Val Glu Ile Leu Gly Thr Val Gln Asp Glu
1395 1400 1405

Leu Pro Gly Leu Pro Glu Leu Ser Pro Glu Asp Leu Ala Glu Cys Arg
1410 1415 1420

Thr Leu Gly Glu Ile Val Asp Tyr Met Asn Ser Lys Leu Ala Asp Gly
1425 1430 1435 1440

Ser Lys Leu Pro Ala Glu Gly Ser Met Asn Ser Gln Leu Ser Thr Ser
1445 1450 1455

Ala Ala Ala Ala Thr Pro Ala Ala Asn Gly Leu Ser Ala Glu Lys Val
1460 1465 1470

Gln Ala Thr Met Met Ser Val Val Ala Glu Lys Thr Gly Tyr Pro Thr
1475 1480 1485

Glu Met Leu Glu Leu Glu Met Asp Met Glu Ala Asp Leu Gly Ile Asp
1490 1495 1500

Ser Ile Lys Arg Val Glu Ile Leu Gly Thr Val Gln Asp Glu Leu Pro
1505 1510 1515 1520

Gly Leu Pro Glu Leu Asn Pro Glu Asp Leu Ala Glu Cys Arg Thr Leu
1525 1530 1535

Gly Glu Ile Val Thr Tyr Met Asn Ser Lys Leu Ala Asp Gly Ser Lys
1540 1545 1550

Leu Pro Ala Glu Gly Ser Met His Tyr Gln Leu Ser Thr Ser Thr Ala
1555 1560 1565

Ala Ala Thr Pro Val Ala Asn Gly Leu Ser Ala Glu Lys Val Gln Ala
1570 1575 1580

Thr Met Met Ser Val Val Ala Asp Lys Thr Gly Tyr Pro Thr Glu Met
1585 1590 1595 1600

Leu Glu Leu Glu Met Asp Met Glu Ala Asp Leu Gly Ile Asp Ser Ile
1605 1610 1615

Lys Arg Val Glu Ile Leu Gly Thr Val Gln Asp Glu Leu Pro Gly Leu
1620 1625 1630

Pro Glu Leu Asn Pro Glu Asp Leu Ala Glu Cys Arg Thr Leu Gly Glu
1635 1640 1645

Ile Val Asp Tyr Met Gly Ser Lys Leu Pro Ala Glu Gly Ser Ala Asn
1650 1655 1660

Thr Ser Ala Ala Ala Ser Leu Asn Val Ser Ala Val Ala Ala Pro Gln
1665 1670 1675 1680

Ala Ala Ala Thr Pro Val Ser Asn Gly Leu Ser Ala Glu Lys Val Gln
1685 1690 1695

Ser Thr Met Met Ser Val Val Ala Glu Lys Thr Gly Tyr Pro Thr Glu
1700 1705 1710

Met Leu Glu Leu Gly Met Asp Met Glu Ala Asp Leu Gly Ile Asp Ser
1715 1720 1725

Ile Lys Arg Val Glu Ile Leu Gly Thr Val Gln Asp Glu Leu Pro Gly
1730 1735 1740

Leu Pro Glu Leu Asn Pro Glu Asp Leu Ala Glu Cys Arg Thr Leu Gly
1745 1750 1755 1760

Glu Ile Val Asp Tyr Met Asn Ser Lys Leu Ala Asp Gly Ser Lys Leu
1765 1770 1775

Pro Ala Glu Gly Ser Ala Asn Thr Ser Ala Thr Ala Ala Thr Pro Ala
1780 1785 1790

Val Asn Gly Leu Ser Ala Asp Lys Val Gln Ala Thr Met Met Ser Val
1795 1800 1805

Val Ala Glu Lys Thr Gly Tyr Pro Thr Glu Met Leu Glu Leu Gly Met
1810 1815 1820

Asp Met Glu Ala Asp Leu Gly Ile Asp Ser Ile Lys Arg Val Glu Ile
1825 1830 1835 1840



Leu Gly Thr Val Gln Asp Glu Leu Pro Gly Leu Pro Glu Leu Asn Pro
1845 1850 1855

Glu Asp Leu Ala Glu Cys Arg Thr Leu Gly Glu Ile Val Ser Tyr Met
1860 1865 1870

Asn Ser Gln Leu Ala Asp Gly Ser Lys Leu Ser Thr Ser Ala Ala Glu
1875 1880 1885

Gly Ser Ala Asp Thr Ser Ala Ala Asn Ala Ala Lys Pro Ala Ala Ile
1890 1895 1900

Ser Ala Glu Pro Ser Val Glu Leu Pro Pro His Ser Glu Val Ala Leu
1905 1910 1915 1920

Lys Lys Leu Asn Ala Ala Asn Lys Leu Glu Asn Cys Phe Ala Ala Asp
1925 1930 1935

Ala Ser Val Val Ile Asn Asp Asp Gly His Asn Ala Gly Val Leu Ala
1940 1945 1950

Glu Lys Leu Ile Lys Gln Gly Leu Lys Val Ala Val Val Arg Leu Pro
1955 1960 1965

Lys Gly Gln Pro Gln Ser Pro Leu Ser Ser Asp Val Ala Ser Phe Glu
1970 1975 1980

Leu Ala Ser Ser Gln Glu Ser Glu Leu Glu Ala Ser Ile Thr Ala Val
1985 1990 1995 2000

Ile Ala Gln Ile Glu Thr Gln Val Gly Ala Ile Gly Gly Phe Ile His
2005 2010 2015

Leu Gln Pro Glu Ala Asn Thr Glu Glu Gln Thr Ala Val Asn Leu Asp
2020 2025 2030

Ala Gln Ser Phe Thr His Val Ser Asn Ala Phe Leu Trp Ala Lys Leu
2035 2040 2045

Leu Gln Pro Lys Leu Val Ala Gly Ala Asp Ala Arg Arg Cys Phe Val
2050 2055 2060

Thr Val Ser Arg Ile Asp Gly Gly Phe Gly Tyr Leu Asn Thr Asp Ala
2065 2070 2075 2080

Leu Lys Asp Ala Glu Leu Asn Gln Ala Ala Leu Ala Gly Leu Thr Lys
2085 2090 2095

Thr Leu Ser His Glu Trp Pro Gln Val Phe Cys Arg Ala Leu Asp Ile
2100 2105 2110

Ala Thr Asp Val Asp Ala Thr His Leu Ala Asp Ala Ile Thr Ser Glu
2115 2120 2125

Leu Phe Asp Ser Gln Ala Gln Leu Pro Glu Val Gly Leu Ser Leu Ile
2130 2135 2140

Asp Gly Lys Val Asn Arg Val Thr Leu Val Ala Ala Glu Ala Ala Asp
2145 2150 2155 2160

Lys Thr Ala Lys Ala Glu Leu Asn Ser Thr Asp Lys Ile Leu Val Thr
2165 2170 2175

Gly Gly Ala Lys Gly Val Thr Phe Glu Cys Ala Leu Ala Leu Ala Ser
2180 2185 2190

Arg Ser Gln Ser His Phe Ile Leu Ala Gly Arg Ser Glu Leu Gln Ala
2195 2200 2205

Leu Pro Ser Trp Ala Glu Gly Lys Gln Thr Ser Glu Leu Lys Ser Ala
2210 2215 2220

Ala Ile Ala His Ile Ile Ser Thr Gly Gln Lys Pro Thr Pro Lys Gln
2225 2230 2235 2240

Val Glu Ala Ala Val Trp Pro Val Gln Ser Ser Ile Glu Ile Asn Ala
2245 2250 2255

Ala Leu Ala Ala Phe Asn Lys Val Gly Ala Ser Ala Glu Tyr Val Ser
2260 2265 2270

Met Asp Val Thr Asp Ser Ala Ala Ile Thr Ala Ala Leu Asn Gly Arg
2275 2280 2285

Ser Asn Glu Ile Thr Gly Leu Ile His Gly Ala Gly Val Leu Ala Asp
2290 2295 2300

Lys His Ile Gln Asp Lys Thr Leu Ala Glu Leu Ala Lys Val Tyr Gly
2305 2310 2315 2320

Thr Lys Val Asn Gly Leu Lys Ala Leu Leu Ala Ala Leu Glu Pro Ser
2325 2330 2335

Lys Ile Lys Leu Leu Ala Met Phe Ser Ser Ala Ala Gly Phe Tyr Gly
2340 2345 2350

Asn Ile Gly Gln Ser Asp Tyr Ala Met Ser Asn Asp Ile Leu Asn Lys
2355 2360 2365

Ala Ala Leu Gln Phe Thr Ala Arg Asn Pro Gln Ala Lys Val Met Ser
2370 2375 2380

Phe Asn Trp Gly Pro Trp Asp Gly Gly Met Val Asn Pro Ala Leu Lys
2385 2390 2395 2400

Lys Met Phe Thr Glu Arg Gly Val Tyr Val Ile Pro Leu Lys Ala Gly
2405 2410 2415

Ala Glu Leu Phe Ala Thr Gln Leu Leu Ala Glu Thr Gly Val Gln Leu
2420 2425 2430

Leu Ile Gly Thr Ser Met Gln Gly Gly Ser Asp Thr Lys Ala Thr Glu
2435 2440 2445

Thr Ala Ser Val Lys Lys Leu Asn Ala Gly Glu Val Leu Ser Ala Ser
2450 2455 2460

His Pro Arg Ala Gly Ala Gln Lys Thr Pro Leu Gln Ala Val Thr Ala
2465 2470 2475 2480

Thr Arg Leu Leu Thr Pro Ser Ala Met Val Phe Ile Glu Asp His Arg
2485 2490 2495

Ile Gly Gly Asn Ser Val Leu Pro Thr Val Cys Ala Ile Asp Trp Met
2500 2505 2510

Arg Glu Ala Ala Ser Asp Met Leu Gly Ala Gln Val Lys Val Leu Asp
2515 2520 2525

Tyr Lys Leu Leu Lys Gly Ile Val Phe Glu Thr Asp Glu Pro Gln Glu
2530 2535 2540

Leu Thr Leu Glu Leu Thr Pro Asp Asp Ser Asp Glu Ala Thr Leu Gln
2545 2550 2555 2560

Ala Leu Ile Ser Cys Asn Gly Arg Pro Gln Tyr Lys Ala Thr Leu Ile
2565 2570 2575

Ser Asp Asn Ala Asp Ile Lys Gln Leu Asn Lys Gln Phe Asp Leu Ser
2580 2585 2590

Ala Lys Ala Ile Thr Thr Ala Lys Glu Leu Tyr Ser Asn Gly Thr Leu
2595 2600 2605

Phe His Gly Pro Arg Leu Gln Gly Ile Gln Ser Val Val Gln Phe Asp
2610 2615 2620

Asp Gln Gly Leu Ile Ala Lys Val Ala Leu Pro Lys Val Glu Leu Ser
2625 2630 2635 2640

Asp Cys Gly Glu Phe Leu Pro Gln Thr His Met Gly Gly Ser Gln Pro
2645 2650 2655

Phe Ala Glu Asp Leu Leu Leu Gln Ala Met Leu Val Trp Ala Arg Leu
2660 2665 2670

Lys Thr Gly Ser Ala Ser Leu Pro Ser Ser Ile Gly Glu Phe Thr Ser
2675 2680 2685

Tyr Gln Pro Met Ala Phe Gly Glu Thr Gly Thr Ile Glu Leu Glu Val
2690 2695 2700

Ile Lys His Asn Lys Arg Ser Leu Glu Ala Asn Val Ala Leu Tyr Arg
2705 2710 2715 2720

Asp Asn Gly Glu Leu Ser Ala Met Phe Lys Ser Ala Lys Ile Thr Ile
2725 2730 2735

Ser Lys Ser Leu Asn Ser Ala Phe Leu Pro Ala Val Leu Ala Asn Asp
2740 2745 2750

Ser Glu Ala Asn
2755



<210> 8 
<211> 771 
<212> PRT 

<213> Shewanella putrefaciens 



<400> 8 

Met Pro Leu Arg Ile Ala Leu Ile Leu Leu Pro Thr Pro Gln Phe Glu
1 5 10 15

Val Asn Ser Val Asp Gln Ser Val Leu Ala Ser Tyr Gln Thr Leu Gln
20 25 30

Pro Glu Leu Asn Ala Leu Leu Asn Ser Ala Pro Thr Pro Glu Met Leu
35 40 45

Ser Ile Thr Ile Ser Asp Asp Ser Asp Ala Asn Ser Phe Glu Ser Gln
50 55 60

Leu Asn Ala Ala Thr Asn Ala Ile Asn Asn Gly Tyr Ile Val Lys Leu
65 70 75 80

Ala Thr Ala Thr His Ala Leu Leu Met Leu Pro Ala Leu Lys Ala Ala
85 90 95

Gln Met Arg Ile His Pro His Ala Gln Leu Ala Ala Met Gln Gln Ala
100 105 110

Lys Ser Thr Pro Met Ser Gln Val Ser Gly Glu Leu Lys Leu Gly Ala
115 120 125

Asn Ala Leu Ser Leu Ala Gln Thr Asn Ala Leu Ser His Ala Leu Ser
130 135 140

Gln Ala Lys Arg Asn Leu Thr Asp Val Ser Val Asn Glu Cys Phe Glu
145 150 155 160

Asn Leu Lys Ser Glu Gln Gln Phe Thr Glu Val Tyr Ser Leu Ile Gln
165 170 175

Gln Leu Ala Ser Arg Thr His Val Arg Lys Glu Val Asn Gln Gly Val
180 185 190

Glu Leu Gly Pro Lys Gln Ala Lys Ser His Tyr Trp Phe Ser Glu Phe
195 200 205

His Gln Asn Arg Val Ala Ala Ile Asn Phe Ile Asn Gly Gln Gln Ala
210 215 220

Thr Ser Tyr Val Leu Thr Gln Gly Ser Gly Leu Leu Ala Ala Lys Ser
225 230 235 240

Met Leu Asn Gln Gln Arg Leu Met Phe Ile Leu Pro Gly Asn Ser Gln
245 250 255

Gln Gln Ile Thr Ala Ser Ile Thr Gln Leu Met Gln Gln Leu Glu Arg
260 265 270

Leu Gln Val Thr Glu Val Asn Glu Leu Ser Leu Glu Cys Gln Leu Glu
275 280 285

Leu Leu Ser Ile Met Tyr Asp Asn Leu Val Asn Ala Asp Lys Leu Thr
290 295 300

Thr Arg Asp Ser Lys Pro Ala Tyr Gln Ala Val Ile Gln Ala Ser Ser
305 310 315 320



Val Ser Ala Ala Lys Gln Glu Leu Ser Ala Leu Asn Asp Ala Leu Thr
325 330 335

Ala Leu Phe Ala Glu Gln Thr Asn Ala Thr Ser Thr Asn Lys Gly Leu
340 345 350

Ile Gln Tyr Lys Thr Pro Ala Gly Ser Tyr Leu Thr Leu Thr Pro Leu
355 360 365

Gly Ser Asn Asn Asp Asn Ala Gln Ala Gly Leu Ala Phe Val Tyr Pro
370 375 380

Gly Val Gly Thr Val Tyr Ala Asp Met Leu Asn Glu Leu His Gln Tyr
385 390 395 400

Phe Pro Ala Leu Tyr Ala Lys Leu Glu Arg Glu Gly Asp Leu Lys Ala
405 410 415

Met Leu Gln Ala Glu Asp Ile Tyr His Leu Asp Pro Lys His Ala Ala
420 425 430

Gln Met Ser Leu Gly Asp Leu Ala Ile Ala Gly Val Gly Ser Ser Tyr
435 440 445

Leu Leu Thr Gln Leu Leu Thr Asp Glu Phe Asn Ile Lys Pro Asn Phe
450 455 460

Ala Leu Gly Tyr Ser Met Gly Glu Ala Ser Met Trp Ala Ser Leu Gly
465 470 475 480

Val Trp Gln Asn Pro His Ala Leu Ile Ser Lys Thr Gln Thr Asp Pro
485 490 495

Leu Phe Thr Ser Ala Ile Ser Gly Lys Leu Thr Ala Val Arg Gln Ala
500 505 510

Trp Gln Leu Asp Asp Thr Ala Ala Glu Ile Gln Trp Asn Ser Phe Val
515 520 525

Val Arg Ser Glu Ala Ala Pro Ile Glu Ala Leu Leu Lys Asp Tyr Pro
530 535 540


His Ala Tyr Leu Ala Ile Ile Gln Gly Asp Thr Cys Val Ile Ala Gly
545 550 555 560

Cys Glu Ile Gln Cys Lys Ala Leu Leu Ala Ala Leu Gly Lys Arg Gly
565 570 575

Ile Ala Ala Asn Arg Val Thr Ala Met His Thr Gln Pro Ala Met Gln
580 585 590

Glu His Gln Asn Val Met Asp Phe Tyr Leu Gln Pro Leu Lys Ala Glu
595 600 605

Leu Pro Ser Glu Ile Ser Phe Ile Ser Ala Ala Asp Leu Thr Ala Lys
610 615 620

Gln Thr Val Ser Glu Gln Ala Leu Ser Ser Gln Val Val Ala Gln Ser
625 630 635 640

Ile Ala Asp Thr Phe Cys Gln Thr Leu Asp Phe Thr Ala Leu Val His
645 650 655

His Ala Gln His Gln Gly Ala Lys Leu Phe Val Glu Ile Gly Ala Asp
660 665 670

Arg Gln Asn Cys Thr Leu Ile Asp Lys Ile Val Lys Gln Asp Gly Ala
675 680 685

Ser Ser Val Gln His Gln Pro Cys Cys Thr Val Pro Met Asn Ala Lys
690 695 700

Gly Ser Gln Asp Ile Thr Ser Val Ile Lys Ala Leu Gly Gln Leu Ile
705 710 715 720

Ser His Gln Val Pro Leu Ser Val Gln Pro Phe Ile Asp Gly Leu Lys
725 730 735

Arg Glu Leu Thr Leu Cys Gln Leu Thr Ser Gln Gln Leu Ala Ala His
740 745 750

Ala Asn Val Asp Ser Lys Phe Glu Ser Asn Gln Asp His Leu Leu Gln
755 760 765

Gly Glu Val
770



<210> 9 
<211> 2004 
<212> PRT 

<213> Shewanella putrefaciens
<400> 9

Met Ser Leu Pro Asp Asn Ala Ser Asn His Leu Ser Ala Asn Gln Lys
1 5 10 15

Gly Ala Ser Gln Ala Ser Lys Thr Ser Lys Gln Ser Lys Ile Ala Ile
20 25 30

Val Gly Leu Ala Thr Leu Tyr Pro Asp Ala Lys Thr Pro Gln Glu Phe
35 40 45

Trp Gln Asn Leu Leu Asp Lys Arg Asp Ser Arg Ser Thr Leu Thr Asn
50 55 60

Glu Lys Leu Gly Ala Asn Ser Gln Asp Tyr Gln Gly Val Gln Gly Gln
65 70 75 80

Ser Asp Arg Phe Tyr Cys Asn Lys Gly Gly Tyr Ile Glu Asn Phe Ser
85 90 95

Phe Asn Ala Ala Gly Tyr Lys Leu Pro Glu Gln Ser Leu Asn Gly Leu
100 105 110

Asp Asp Ser Phe Leu Trp Ala Leu Asp Thr Ser Arg Asn Ala Leu Ile
115 120 125

Asp Ala Gly Ile Asp Ile Asn Gly Ala Asp Leu Ser Arg Ala Gly Val
130 135 140

Val Met Gly Ala Leu Ser Phe Pro Thr Thr Arg Ser Asn Asp Leu Phe
145 150 155 160

Leu Pro Ile Tyr His Ser Ala Val Glu Lys Ala Leu Gln Asp Lys Leu
165 170 175

Gly Val Lys Ala Phe Lys Leu Ser Pro Thr Asn Ala His Thr Ala Arg
180 185 190

Ala Ala Asn Glu Ser Ser Leu Asn Ala Ala Asn Gly Ala Ile Ala His
195 200 205

Asn Ser Ser Lys Val Val Ala Asp Ala Leu Gly Leu Gly Gly Ala Gln
210 215 220

Leu Ser Leu Asp Ala Ala Cys Ala Ser Ser Val Tyr Ser Leu Lys Leu
225 230 235 240

Ala Cys Asp Tyr Leu Ser Thr Gly Lys Ala Asp Ile Met Leu Ala Gly
245 250 255

Ala Val Ser Gly Ala Asp Pro Phe Phe Ile Asn Met Gly Phe Ser Ile
260 265 270

Phe His Ala Tyr Pro Asp His Gly Ile Ser Val Pro Phe Asp Ala Ser
275 280 285

Ser Lys Gly Leu Phe Ala Gly Glu Gly Ala Gly Val Leu Val Leu Lys
290 295 300

Arg Leu Glu Asp Ala Glu Arg Asp Asn Asp Lys Ile Tyr Ala Val Val
305 310 315 320

Ser Gly Val Gly Leu Ser Asn Asp Gly Lys Gly Gln Phe Val Leu Ser
325 330 335

Pro Asn Pro Lys Gly Gln Val Lys Ala Phe Glu Arg Ala Tyr Ala Ala
340 345 350

Ser Asp Ile Glu Pro Lys Asp Ile Glu Val Ile Glu Cys His Ala Thr
355 360 365

Gly Thr Pro Leu Gly Asp Lys Ile Glu Leu Thr Ser Met Glu Thr Phe
370 375 380

Phe Glu Asp Lys Leu Gln Gly Thr Asp Ala Pro Leu Ile Gly Ser Ala
385 390 395 400

Lys Ser Asn Leu Gly His Leu Leu Thr Ala Ala His Ala Gly Ile Met
405 410 415

Lys Met Ile Phe Ala Met Lys Glu Gly Tyr Leu Pro Pro Ser Ile Asn
420 425 430

Ile Ser Asp Ala Ile Ala Ser Pro Lys Lys Leu Phe Gly Lys Pro Thr
435 440 445

Leu Pro Ser Met Val Gln Gly Trp Pro Asp Lys Pro Ser Asn Asn His
450 455 460

Phe Gly Val Arg Thr Arg His Ala Gly Val Ser Val Phe Gly Phe Gly
465 470 475 480

Gly Cys Asn Ala His Leu Leu Leu Glu Ser Tyr Asn Gly Lys Gly Thr
485 490 495

Val Lys Ala Glu Ala Thr Gln Val Pro Arg Gln Ala Glu Pro Leu Lys
500 505 510

Val Val Gly Leu Ala Ser His Phe Gly Pro Leu Ser Ser Ile Asn Ala
515 520 525

Leu Asn Asn Ala Val Thr Gln Asp Gly Asn Gly Phe Ile Glu Leu Pro
530 535 540

Lys Lys Arg Trp Lys Gly Leu Glu Lys His Ser Glu Leu Leu Ala Glu
545 550 555 560

Phe Gly Leu Ala Ser Ala Pro Lys Gly Ala Tyr Val Asp Asn Phe Glu
565 570 575

Leu Asp Phe Leu Arg Phe Lys Leu Pro Pro Asn Glu Asp Asp Arg Leu
580 585 590

Ile Ser Gln Gln Leu Met Leu Met Arg Val Thr Asp Glu Ala Ile Arg
595 600 605

Asp Ala Lys Leu Glu Pro Gly Gln Lys Val Ala Val Leu Val Ala Met
610 615 620

Glu Thr Glu Leu Glu Leu His Gln Phe Arg Gly Arg Val Asn Leu His
625 630 635 640

Thr Gln Leu Ala Gln Ser Leu Ala Ala Met Gly Val Ser Leu Ser Thr
645 650 655

Asp Glu Tyr Gln Ala Leu Glu Ala Ile Ala Met Asp Ser Val Leu Asp
660 665 670

Ala Ala Lys Leu Asn Gln Tyr Thr Ser Phe Ile Gly Asn Ile Met Ala
675 680 685

Ser Arg Val Ala Ser Leu Trp Asp Phe Asn Gly Pro Ala Phe Thr Ile
690 695 700

Ser Ala Ala Glu Gln Ser Val Ser Arg Cys Ile Asp Val Ala Gln Asn
705 710 715 720

Leu Ile Met Glu Asp Asn Leu Asp Ala Val Val Ile Ala Ala Val Asp
725 730 735

Leu Ser Gly Ser Phe Glu Gln Val Ile Leu Lys Asn Ala Ile Ala Pro
740 745 750

Val Ala Ile Glu Pro Asn Leu Glu Ala Ser Leu Asn Pro Thr Ser Ala
755 760 765

Ser Trp Asn Val Gly Glu Gly Ala Gly Ala Val Val Leu Val Lys Asn
770 775 780

Glu Ala Thr Ser Gly Cys Ser Tyr Gly Gln Ile Asp Ala Leu Gly Phe
785 790 795 800

Ala Lys Thr Ala Glu Thr Ala Leu Ala Thr Asp Lys Leu Leu Ser Gln
805 810 815

Thr Ala Thr Asp Phe Asn Lys Val Lys Val Ile Glu Thr Met Ala Ala
820 825 830

Pro Ala Ser Gln Ile Gln Leu Ala Pro Ile Val Ser Ser Gln Val Thr
835 840 845

His Thr Ala Ala Glu Gln Arg Val Gly His Cys Phe Ala Ala Ala Gly
850 855 860

Met Ala Ser Leu Leu His Gly Leu Leu Asn Leu Asn Thr Val Ala Gln
865 870 875 880

Thr Asn Lys Ala Asn Cys Ala Leu Ile Asn Asn Ile Ser Glu Asn Gln
885 890 895

Leu Ser Gln Leu Leu Ile Ser Gln Thr Ala Ser Glu Gln Gln Ala Leu
900 905 910

Thr Ala Arg Leu Ser Asn Glu Leu Lys Ser Asp Ala Lys His Gln Leu
915 920 925

Val Lys Gln Val Thr Leu Gly Gly Arg Asp Ile Tyr Gln His Ile Val
930 935 940

Asp Thr Pro Leu Ala Ser Leu Glu Ser Ile Thr Gln Lys Leu Ala Gln
945 950 955 960

Ala Thr Ala Ser Thr Val Val Asn Gln Val Lys Pro Ile Lys Ala Ala
965 970 975

Gly Ser Val Glu Met Ala Asn Ser Phe Glu Thr Glu Ser Ser Ala Glu
980 985 990

Pro Gln Ile Thr Ile Ala Ala Gln Gln Thr Ala Asn Ile Gly Val Thr
995 1000 1005

Ala Gln Ala Thr Lys Arg Glu Leu Gly Thr Pro Pro Met Thr Thr Asn
1010 1015 1020

Thr Ile Ala Asn The Ala Asn Asn Leu Asp Lys Thr Leu Glu Thr Val
1025 1030 1035 1040

Ala Gly Asn Thr Val Ala Ser Lys Val Gly Ser Gly Asp Ile Val Asn
1045 1050 1055

Phe Gln Gln Asn Gln Gln Leu Ala Gln Gln Ala His Leu Ala Phe Leu
1060 1065 1070

Glu Ser Arg Ser Ala Gly Met Lys Val Ala Asp Ala Leu Leu Lys Gln
1075 1080 1085

Gln Leu Ala Gln Val Thr Gly Gln Thr Ile Asp Asn Gln Ala Leu Asp
1090 1095 1100

Thr Gln Ala Val Asp Thr Gln Thr Ser Glu Asn Val Ala Ile Ala Ala
1105 1110 1115 1120

Glu Ser Pro Val Gln Val Thr Thr Pro Val Gln Val Thr Thr Pro Val
1125 1130 1135

GlA Ile Ser Val Val Glu Leu Lys Pro Asp His Ala Asn Val Pro Pro
1140 1145 1150

Tyr Thr Pro Pro Val Pro Ala Leu Lys Pro Cys Ile Trp Asn Tyr Ala
1155 1160 1165

Asp Leu Val Glu Tyr Ala Glu Gly Asp Ile Ala Lys Val Phe Gly Ser
1170 1175 1180

Asp Tyr Ala Ile Ile Asp Ser Tyr Ser Arg Arg Val Arg Leu Pro Thr
1185 1190 1195 1200

Thr Asp Tyr Leu Leu Val Ser Arg Val Thr Lys Leu Asp Ala Thr Ile
1205 1210 1215

Asn Gln Phe Lys Pro Cys Ser Met Thr Thr Glu Tyr Asp Ile Pro Val
1220 1225 1230

Asp Ala Pro Tyr Leu Val Asp Gly Gln Ile Pro Trp Ala Val Ala Val
1235 1240 1245

Glu Ser Gly Gln Cys Asp Leu Met Leu Ile Ser Tyr Leu Gly Ile Asp
1250 1255 1260

Phe Glu Asn Lys Gly Glu Arg Val Tyr Arg Leu Leu Asp Cys Thr Leu
1265 1270 1275 1280

Thr Phe Leu Gly Asp Leu Pro Arg Gly Gly Asp Thr Leu Arg Tyr Asp
1285 1290 1295

Ile Lys Ile Asn Asn Tyr Ala Arg Asn Gly Asp Thr Leu Leu Phe Phe
1300 1305 1310

Phe Ser Tyr Glu Cys Phe Val Gly Asp Lys Met Ile Leu Lys Met Asp
1315 1320 1325

Gly Gly Cys Ala Gly Phe Phe Thr Asp Glu Glu Leu Ala Asp Gly Lys
1330 1335 1340

Gly Val Ile Arg Thr Glu Glu Glu Ile Lys Ala Arg Ser Leu Val Gln
1345 1350 1355 1360

Lys Gln Arg Phe Asn Pro Leu Leu Asp Cys Pro Lys Thr Gln Phe Ser
1365 1370 1375

Tyr Gly Asp Ile His Lys Leu Leu Thr Ala Asp Ile Glu Gly Cys Phe
1380 1385 1390

Gly Pro Ser His Ser Gly Val His Gln Pro Ser Leu Cys Phe Ala Ser
1395 1400 1405

Glu Lys Phe Leu Met Ile Glu Gln Val Ser Lys Val Asp Arg Thr Gly
1410 1415 1420

Gly Thr Trp Gly Leu Gly Leu Ile Glu Gly His Lys Gln Leu Glu Ala
1425 1430 1435 1440

Asp His Trp Tyr Phe Pro Cys His Phe Lys Gly Asp Gln Val Met Ala
1445 1450 1455

Gly Ser Leu Met Ala Glu Gly Cys Gly Gln Leu Leu Gln Phe Tyr Met
1460 1465 1470

Leu His Leu Gly Met His The Gln Thr Lys Asn Gly Arg Phe Gln Pro
1475 1480 1485

Leu Glu Asn Ala Ser Gln Gln Val Arg Cys Arg Gly Gln Val Leu Pro
1490 1495 1500

Gln Ser Gly Val Leu Thr Tyr Arg Met Glu Val Thr Glu Ile Gly Phe
1505 1510 1515 1520

Ser Pro Arg Pro Tyr Ala Lys Ala Asn Ile Asp Ile Leu Leu Asn Gly
1525 1530 1535

Lys Ala Val Val Asp Phe Gln Asn Leu Gly Val Met Ile Lys Glu Glu
1540 1545 1550

Asp Glu Cys Thr Arg Tyr Pro Leu Leu Thr Glu Ser Thr Thr Ala Ser
1555 1560 1565

Thr Ala Gln Val Asn Ala Gln Thr Ser Ala Lys Lys Val Tyr Lys Pro
1570 1575 1580

Ala Ser Val Asn Ala Pro Leu Met Ala Gln Ile Pro Asp Leu Thr Lys
1585 1590 1595 1600

Glu Pro Asn Lys Gly Val Ile Pro Ile Ser His Val Glu Ala Pro Ile
1605 1610 1615

Thr Pro Asp Tyr Pro Asn Arg Val Pro Asp Thr Val Pro Phe Thr Pro
1620 1625 1630

Tyr His Met Phe Glu the Ala Thr Gly Asn Ile Glu Asn Cys Phe Gly
1635 1640 1645

Pro Glu Phe Ser Ile Tyr Arg Gly Met Ile Pro Pro Arg Thr Pro Cys
1650 1655 1660

Gly Asp Leu Gln Val Thr Thr Arg Val Ile Glu Val Asn Gly Lys Arg
1665 1670 1675 1680

Gly Asp Phe Lys Lys Pro Ser Ser Cys Ile Ala Glu Tyr Glu Val Pro
1685 1690 1695

Ala Asp Ala Trp Tyr Phe Asp Lys Asn Ser His Gly Ala Val Met Pro
1700 1705 1710

Tyr Ser Ile Leu Met Glu Ile Ser Leu Gln Pro Asn Gly Phe Ile Ser
1715 1720 1725

Gly Tyr Met Gly Thr Thr Leu Gly Phe Pro Gly Leu Glu Leu Phe Phe
1730 1735 1740

Arg Asn Leu Asp Gly Ser Gly Glu Leu Leu Arg Glu Val Asp Leu Arg
1745 1750 1755 1760

Gly Lys Thr Ile Arg Asn Asp Ser Arg Leu Leu Ser Thr Val Met Ala
1765 1770 1775

Gly Thr Asn Ile Ile Gln Ser Phe Ser Phe Glu Leu Ser Thr Asp Gly
1780 1785 1790

Glu Pro Phe Tyr Arg Gly Thr Ala Val Phe Gly Tyr Phe Lys Gly Asp
1795 1800 1805

Ala Leu Lys Asp Gln Leu Gly Leu Asp Asn Gly Lys Val Thr Gln Pro
1810 1815 1820

Trp His Val Ala Asn Gly Val Ala Ala Ser Thr Lys Val Asn Leu Leu
1825 1830 1835 1840

Asp Lys Ser Cys Arg His Phe Asn Ala Pro Ala Asn Gln Pro His Tyr
1845 1850 1855

Arg Leu Ala Gly Gly Gln Leu Asn Phe Ile Asp Ser Val Glu Ile Val
1860 1865 1870

Asp Asn Gly Gly Thr Glu Gly Leu Gly Tyr Leu Tyr Ala Glu Arg Thr
1875 1880 1885

Ile Asp Pro Ser Asp Trp Phe Phe Gln Phe His Phe His Gln Asp Pro
1890 1895 1900

Val Met Pro Gly Ser Leu Gly Val Glu Ala Ile Ile Glu Thr Met Gln
1905 1910 1915 1920

Ala Tyr Ala Ile Ser Lys Asp Leu Gly Ala Asp Phe Lys Asn Pro Lys
1925 1930 1935

Phe Gly Gln Ile Leu Ser Asn Ile Lys Trp Lys Tyr Arg Gly Gln Ile
1940 1945 1950

Asn Pro Leu Asn Lys Gln Met Ser Met Asp Val Ser Ile Thr Ser Ile
1955 1960 1965

Lys Asp Glu Asp Gly Lys Lys Val Ile Thr Gly Asn Ala Ser Leu Ser
1970 1975 1980

Lys Asp Gly Leu Arg Ile Tyr Glu Val Phe Asp Ile Ala Ile Ser Ile
1985 1990 1995 2000

Glu Glu Ser Val



<210> 10 
<211> 543 
<212> PRT 

<213> Shewanella putrefaciens
<400> 10

Met Asn Pro Thr Ala Thr Asn Glu Met Leu Ser Pro Trp Pro Trp Ala
1 5 10 15

Val Thr Glu Ser Asn Ile Ser Phe Asp Val Gln Val Met Glu Gln Gln
20 25 30

Leu Lys Asp Phe Ser Arg Ala Cys Tyr Val Val Asn His Ala Asp His
35 40 45

Gly Phe Gly Ile Ala Gln Thr Ala Asp Ile Val Thr Glu Gln Ala Ala
50 55 60

Asn Ser Thr Asp Leu Pro Val Ser Ala Phe Thr Pro Ala Leu Gly Thr
65 70 75 80

Glu Ser Leu Gly Asp Asn Asn Phe Arg Arg Val His Gly Val Lys Tyr
85 90 95

Ala Tyr Tyr Ala Gly Ala Met Ala Asn Gly Ile Ser Ser Glu Glu Leu
100 105 110

Val Ile Ala Leu Gly Gln Ala Gly Ile Leu Cys Gly Ser Phe Gly Ala
115 120 125

Ala Gly Leu Ile Pro Ser Arg Val Glu Ala Ala Ile Asn Arg Ile Gln
130 135 140

Ala Ala Leu Pro Asn Gly Pro Tyr Met Phe Asn Leu Ile His Ser Pro
145 150 155 160

Ser Glu Pro Ala Leu Glu Arg Gly Ser Val Glu Leu Phe Leu Lys His
165 170 175

Lys Val Arg Thr Val Glu Ala Ser Ala Phe Leu Gly Leu Thr Pro Gln
180 185 190

Ile Val Tyr Tyr Arg Ala Ala Gly Leu Ser Arg Asp Ala Gln Gly Lys
195 200 205

Val Val Val Gly Asn Lys Val Ile Ala Lys Val Ser Arg Thr Glu Val
210 215 220

Ala Glu Lys Phe Met Met Pro Ala Pro Ala Lys Met Leu Gln Lys Leu
225 230 235 240

Val Asp Asp Gly Ser Ile Thr Ala Glu Gln Met Glu Leu Ala Gln Leu
245 250 255

Val Pro Met Ala Asp Asp lie Thx Ala Glu Ala Aap Ser Gly Gly His 
260 265 270 

Thr Asp Asn Arg Pro Leu Val Thr Leu Leu Pro Thr lie Leu Ala Leu 
275 280 285 

Lys Glu Glu lie Gin Ala Lya Tyx Gin Tyr Asp Thr Pro He Axrg Val 
290 295 300 

Gly Cys Gly Gly Gly Val Gly Thr Pro Asp Ala Ala Leu Ala Thr Phe 
30S 310 315 320 

Asn Met Gly Ala Ala Tyr lie Val Thr Gly Ser He Asn Gin Ala Cys 
325 330 335 

Val Glu Ala Gly Ala Ser Asp His Thr Arg Lya Leu Leu Ala Tlir Thr 
340 345 350 

Glu Met Ala Asp Val Thr Met Ala Pro Ala Ala Asp Met Phe Glu Met 
355 360 36S 

Gly Val Lys Leu Gln Val Val Lys Arg Gly Thr Leu Phe Pro Met Arg
370 375 380

Ala Asn Lys Leu Tyr Glu Ile Tyr Thr Arg Tyr Asp Ser Ile Glu Ala
385 390 395 400

Ile Pro Leu Asp Glu Arg Glu Lys Leu Glu Lys Gln Val Phe Arg Ser
405 410 415

Ser Leu Asp Glu Ile Trp Ala Gly Thr Val Ala His Phe Asn Glu Arg
420 425 430

Asp Pro Lys Gln Ile Glu Arg Ala Glu Gly Asn Pro Lys Arg Lys Met
435 440 445

Ala Leu Ile Phe Arg Trp Tyr Leu Gly Leu Ser Ser Arg Trp Ser Asn
450 455 460

Ser Gly Glu Val Gly Arg Glu Met Asp Tyr Gln Ile Trp Ala Gly Pro
465 470 475 480

Ala Leu Gly Ala Phe Asn Gln Trp Ala Lys Gly Ser Tyr Leu Asp Asn
485 490 495

Tyr Gln Asp Arg Asn Ala Val Asp Leu Ala Lys His Leu Met Tyr Gly
500 505 510

Ala Ala Tyr Leu Asn Arg Ile Asn Ser Leu Thr Ala Gln Gly Val Lys
515 520 525

Val Pro Ala Gln Leu Leu Arg Trp Lys Pro Asn Gln Arg Met Ala
530 535 540



<210> 11
<211> 499 
<212> PRT 

<213> Shewanella putrefaciens 
<400> 11 

Met Arg Lys Pro Leu Gln Thr Ile Asn Tyr Asp Tyr Ala Val Trp Asp
1 5 10 15

Arg Thr Tyr Ser Tyr Met Lys Ser Asn Ser Ala Ser Ala Lys Arg Tyr
20 25 30

Tyr Glu Lys His Glu Tyr Pro Asp Asp Thr Phe Lys Ser Leu Lys Val
35 40 45

Asp Gly Val Phe Ile Phe Asn Arg Thr Asn Gln Pro Val Phe Ser Lys
50 55 60

Gly Phe Asn His Arg Asn Asp Ile Pro Leu Val Phe Glu Leu Thr Asp
65 70 75 80

Phe Lys Gln His Pro Gln Asn Ile Ala Leu Ser Pro Gln Thr Lys Gln
85 90 95 

Ala His Pro Pro Ala Ser Lys Pro Leu Asp Ser Pro Asp Asp Val Pro 
100 105 110 

Ser Thr His Gly Val Ile Ala Thr Arg Tyr Gly Pro Ala Ile Tyr Tyr
115 120 125

Ser Ser Thr Ser Ile Leu Lys Ser Asp Arg Ser Gly Ser Gln Leu Gly
130 135 140

Tyr Leu Val Phe Ile Arg Leu Ile Asp Glu Trp Phe Ile Ala Glu Leu
145 150 155 160

Ser Gln Tyr Thr Ala Ala Gly Val Glu Ile Ala Met Ala Asp Ala Ala
165 170 175

Asp Ala Gln Leu Ala Arg Leu Gly Ala Asn Thr Lys Leu Asn Lys Val
180 185 190

Thr Ala Thr Ser Glu Arg Leu Ile Thr Asn Val Asp Gly Lys Pro Leu
195 200 205

Leu Lys Leu Val Leu Tyr His Thr Asn Asn Gln Pro Pro Pro Met Leu
210 215 220

Asp Tyr Ser Ile Ile Ile Leu Leu Val Glu Met Ser Phe Leu Leu Ile
225 230 235 240 

Leu Ala Tyr Phe Leu Tyr Ser Tyr Phe Leu Val Arg Pro Val Arg Lys 
245 250 255 

Leu Ala Ser Asp Ile Lys Lys Met Asp Lys Ser Arg Glu Ile Lys Lys
260 265 270

Leu Arg Tyr His Tyr Pro Ile Thr Glu Leu Val Lys Val Ala Thr His
275 280 285

Phe Asn Ala Leu Met Gly Thr Ile Gln Glu Gln Thr Lys Gln Leu Asn
290 295 300

Glu Gln Val Phe Ile Asp Lys Leu Thr Asn Ile Pro Asn Arg Arg Ala
305 310 315 320

Phe Glu Gln Arg Leu Glu Thr Tyr Cys Gln Leu Leu Ala Arg Gln Gln
325 330 335

Ile Gly Phe Thr Leu Ile Ile Ala Asp Val Asp His Phe Lys Glu Tyr
340 345 350

Asn Asp Thr Leu Gly His Leu Ala Gly Asp Glu Ala Leu Ile Lys Val
355 360 365

Ala Gln Thr Leu Ser Gln Gln Phe Tyr Arg Ala Glu Asp Ile Cys Ala
370 375 380

Arg Phe Gly Gly Glu Glu Phe Ile Met Leu Phe Arg Asp Ile Pro Asp
385 390 395 400

Glu Pro Leu Gln Arg Lys Leu Asp Ala Met Leu His Ser Phe Ala Glu
405 410 415 

Leu Asn Leu Pro His Pro Asn Ser Ser Thr Ala Asn Tyr Val Thr Val
420 425 430

Ser Leu Gly Val Cys Thr Val Val Ala Val Asp Asp Phe Glu Phe Lys
435 440 445

Ser Glu Ser His Ile Ile Gly Ser Gln Ala Ala Leu Ile Ala Asp Lys
450 455 460

Ala Leu Tyr His Ala Lys Ala Cys Gly Arg Asn Gln Ala Leu Ser Lys
465 470 475 480

Thr Thr Ile Thr Val Asp Glu Ile Glu Gln Leu Glu Ala Asn Lys Ile
485 490 495

Gly His Gln



<210> 12 
<211> 40138 
<212> DNA 

<213> Vibrio marinus 
<400> 12

aatagatcga crcgcaaaag ttgcrtaaga cagtgtcaat atagcttctr atutgtaaac 60 
actgttttrt atgtgtaaac atgtttagtg tgtgcaaatg ctgttaatta tcctcttggg 120 
attgtaatag ctgatgttgc tggctaatga gtacttttag ttcggcaata tcttgcttta 180 
aatcgctaac ttcagctttt aattcaccca cacttgttgt atttttaagg ctctcttccc 240 
caccaiicgac aaaccaggat gatatgaaac cggnaaacgt accaaagaga ccgacacctig 300 
cagrcatgag taatgccgca angatacgtc cgccagtggt gacggggtag tagtcaecgt 360 
aaccaacagt cgttattgtc acaaatgacc accaaagcgc gtcgatgccg ttattgatgt 420 
tactgcctac ttgaccetgt tctaacaata- aaataccgat agcaccaaag gtgacaagga 480 
tgaaggatat cgcagatacc agcgaaaagg tggctttaaa ccgatgttca aaaatcattt 540 
ttaagataat ttrtgacgag cgtatattct gaatagatct taatactcta gcgatacgaa 600 
ttatgcgaat aaactgcagt tgctcgacca tcggaatact cgacagtagg rcaatccaac 660 
cccatttcat aaactgaaat ttattctcag cttggtgaaa gcgaattaca aagtcagtga 720 
aaaagaataa gcaaatcgta ttatctacgc tcgttaatac ttcagtgacg ttacttgaaa 780 
aggtaaaaat aagttgcagt agtgatgaca cgaccacatg aagtgataaa ataagcatga 840
aaatctgaaa tggatctaca tcactgttgt trttggtgcc acttctaagg cccgttttca 900 
caatctgctg cctcggttca ttgattttgt caatataaac ctcagtcagc agcaagacaa 960 
aatatattta catcaatgcc atcgtatcat tcaaccgcgc gtcgtgtacc cagaccaaga 1020 
tcgtcgtata tgttagtcat gtagcgatga gattaccatg cgacaggaga gaattatgtt 1080
tgttattatt ttttacgtac ctaaagttaa tgttgaagaa gtaaaacagg cgttatttaa 1140 
cgtcggagct ggcaccatcg gtgattatga tagttgtgct tggcaatgtt tggggactgg 1200 
gcagttccaa ccttcacttg gtagccagcc acatattggt aagctaaatg aggttgaatt 1260 
cgtcgatgag tttagagtag aaatggcttg tcgagcagaa aatgcaaggg cagcaataaa 1320 
tgcacttatt gctgcgcacc cttatgaaga acctgcttat catattctgc aaacattgaa 1380
tcttgacgag ttaccttaag ttagacgcac tgcacttaat cqgttcgccg tgctaggtta 1440 
gcaatcagca attttgacca tgttagcgac agctttggca caagtgatcg atattaaact 1500 
atccgactca gatcccattt ttactgccga atcaggtttc attacacttg ttctagtggt 1560 
ttttcccgac aggtgtaact ccgttacttg cgtaaggttg ataacctcta ccgcattggc 1620
aggagttaca cctgcaccag gcacaatact aattctacca tctgcttggt taactaacgt 1680 
tcggattaag gcgcagcctt ctagcgcctg agcttgttga ccagaggtta aaanacgctc 1740 
acaaccagca gtgatcaagg tctccaaggc ttgttgtgga tcatcacaca agtcgaaagc 1800 
gcggtggaag gttacgccga gatcacgtga tgccaccatt aagcgtttta aagctggctc 1860 
gtcaatatra ccatctgctg ttaacgcgcc aataacgacc ccrtggacac cgagtaactt 1920 
catgaatttg atgticggaaa ccataatatc aacttcttgt tcgcratata caaaatcacc 1980
ggcgcgaggg cgaataatgg cataaatggg gatcgttgct agatcaatag acctttgtac 2040 
aaaacctgcg ttggcggtca agccacctaa tgctaatgcc gagcacaact caatacgatc 2100 
ggcgccagat gcttgagccg tcagcagcga ttctatatta tcgacacata cttccattgt 2160 
cattgtcara tacctctett taaaaagttt attaaaaata ataaagccag cataagtcgt 2220 
ttcatacaat atgaaagggg aaaaggcgac ttagctcgcc tagatcaatt aT:tatggcag 2280
aatactgccg tattgtgatt agaaagacag ttttttaagc tcaatagccg rtatcgcgtc 2340 
gttatccacc atcgtgcaac ttttctggcc tgggtgcttt attaacaccg trtcagtggc 2400 
tggattaggg rgaaargatt cttttttcaa atctgctttt ttgtatttga acgtacctgt 2460 
aatgtcctgd tgctcacgaa gacgtacaaa tattggttgc gcatagcttig gtagtgccgc 2520 
attgacatgc tgacagaatt: cagacgctga aaatrcatga atagggcaat Ccaaagtcag 2580 
cgcgaccatg cctgctcggc catcgtgatg tgggagcttg acaccat:aag ccacactttg 2640 
ctcaattrgc acaaaatpgt; taactlLgagc tCctacttgc gtcgtggcga cattttcacc 2700 
tctccagcgg aatgtLa^cac ctaaticcatc cacaaaggaa atatggcgar^ aAccrtgg^a 2760 
atgaacgaga tcgccrggtat taaaacaaca gtcaccgtct ttraatactg actraaacag 2820 
ctttttatta ctttcgttgt catcggtata accatcaaat ggtigaacgtt: ragetatctt 2880
tgccagcagtr agccctgttt ctcccgtrtt tacttnggtc attttccctt ccgcarraHa 2940 
cacaggtctg tc^tftigtcaa ta^cata^tg tatgacggta aaagcaagtg gagt:aacccc 3000 
cgctgtatgc ggcaag,t:rca gcgpa-ttgga gaacacaaga ttacacccac tggcgceata 3060 
gaat^c^tta, at^tgcvcga rcccaaaacg ttgttggaaa tgatcccaaa - ^tztcggggcg 3120 
taatccatta cctatg^ttt tctttaratt atgcrgtttg tcttrattgc raggbggtac 3180 
^tt.taataaa taacggcaga gct;cgpcgat gtaagtaaac gcagtggcat: tacgageadg 3240 
aacttcatcc caaaagcgac ttgaactgaa tttttcagaa agcgcgaggg ttgctgcgct 3300 
accaaacacg gcgcttaatg acactgtcag tgcattgtta tggtataggg ggagtgatiaa 3360 
atacaataca tcaccagctig ttaagcgtaa tgatgccatc cccatgcctg ccatggattt 3420 
aaaccaacgg tgatggctca ttcttgctgc ttttggcagt ccagrttttc ccgaggtaaa 3480 
gatiaraaaac gcgcaatgct; taagctgtat ttgtgctgtt gactcagggt tcaaractga 3540 
acatcccgcg. actagrgtag atatgttttt ataaccatca ctcatgtctg gcgtttctaa 3600 
agcgggtacg taaaagacat tctgttgtaa tgtcgatgac aaatcggttt caacatcatt 3660 
aatggcggat gtg^atagtt catctgcgat gagcaatttg gtatcgacca cgcraagact 3720 
atgttcgagg attgaatccc: gttgtgtcgt atttatcata caagcaatcg cgccaagctt 3780 
gacaactgcg agggcaataa tgatggcttc aggcctgtta Ccgagcatga tggcgacttt 3840 
atcattttta ccaacgccgt attcatgaag gaaatgggca tattgatctg cttgcttatt 3900 
caatgaatcg taactataac gctggtcttc aaattgtatt gcgatcaagt cagagttatt 3960 
gacagcttgc tgccctagta ataaaccaat agacataaaa cgttcgggct ttgcttgttg 4020 
taagtgccat aagcctttga tgattggctc tggggttttt aacagattga tggtactrtt 4080 
caggaattgt ttgccggtta taacagtcat aagctaattc tttttatcaa gaagaggggt 4140 
tatgacacca aataaatggg tcacgcgttg gtttaatttg gttagactaa atgtgttgtt 4200 
ttgctgtgat aatgcgacgt Ccaaacaaac tcgagaaggt aaaaaaatag catttttaaa 4260 
tcgaacatca atactaatgt gttgaatatc aatcaagttt tctaactgtg cgagcacgcg 4320 
tgctxtagca aacatgccat gtgctattgc tgttttaaac cccattagct tcgctgggat 4380 
aaaatgtaaa tggactggat ttgtgtcttt ggagatataa gcatatttat atacgtcaaa 4440 
aggactaaat ttaaacaatg aaatcggctc gtaagcataa tccgctggcg tatttactat 4500
tttctcaccg ctggaacgtt gagatcgttg gcacgttttt cgctgtttcg ttctctgtaa 4560 
gaatgtcgat gtacactccc acgcaaattg tccatctaca aacacatcaa tatgagtatc 4620 
aatgaaacgt cctgtatccg ttatgtactc cttaattaca cgacatgtgc tcgtcaatat 4680 
cgcgrttaat gctatcggtt gatgttgtgt tatgcgattt cgataatgga ctagtcctaa 4740
taragatatc ggaaattgtg ttgatgtcat gagtttcatc aataatggaa agatcatcac 4800 
aaatggataa gtaaccggta catagtttgt gttattaaac ccacagcatt taatatatcg 4860
ccttaaattt cgccgatcta ttttttgtcc actgacacta aattgctcag tacacacttg 4920 
tgtcgaccaa gcgttcatca gtgttctaac aattgtattg accactgctt tcacatataa 4980 
aagcgagata atcggttgct ttgtcaacag tgtgatctgg ttagcgtgca ttgaaataat 5040
tcatataaga gcatigtagca tctatgttaa tactttgttt tggaagttga attggcgaat 5100 
ccgtaatcgg ttt;atggcag ttcggtcaaa tacctcaggt aaactcgtta ctcataccat 5160 
tgatagtgtt aaagtgattg actgaataaa gaatagagct aaaagtggaa aaattatgca 5220
agatgcgggt atgttartac gcattgctta tgaggcaatg aaagagttag aggttgatgt 5280
cattgaagta ctttctcgtt gtaacataag tgaagaagta ctgaatgata aggaT:cttcg 5340
cacacctaat cacgcacaaa cacatttttg gcaagtatca gaagacatat cacaagatcc 5400 
taacatcggc attccacttg gtgagagaac gccagtgttc acggggcagg tattacagta 5460 
tctrtttctc agtiagtccta cacttggtac tggctgggaa cgcgcaacaa aaractttcg 5520
attaatcagt gatgcggcga gtgtttctat caagatggaa ggccgtgaag cgcgattatc 5580 
tgcgaactta gatiggtttag cggaagatgc gaatcgtcat ttgaacgatt gcctagcgat 5640
cggtgcattt aaattttgtt tatatgrgac agaaggcgaa tttaaagtaa gcaaaatagc 5700 
ctccgctcat gctcgcccga aagatattac tgcctatacc aatgtatna catgtccgat 5760 
tgagtttgct gccgaagata attatattta: tctcgatgct gatttacccg aacgtccttc 5820 
tccgcatgcg gagcctgagc tactcgcctt acacgatcag cttgcaagcc gcaaaacagc 5880
caag^tagaa ctgcaagatt tagtggaraa agtacgcaag gttattgcac aacaacrtga 5940
gtctggtgtg. gtigacrtcag aaagca^cgc .cact:gaactt gacaCgaaac cacgtatgct 6000 
aagagcgaag ttagccgaca trgattataa crrtaatcaa atactcgctg actttcgrtg 6060 
cgagttatca aaaaaactgt tggcgaatae ggacgagtct attgatcaga ttgtccatct 6120 
cactggtttt tctgaaccaa gtacttttta tcgtgccttt aagcgctggg ttaaaatgac 6180
gccaattgaa tatcgccgta gcaaactcgc ggCtaggcat gctaaccaac acgagtccta 6240
aaaattcgcr gcttagtgca tagtgcatag tgcatagtgc tagtaagcca agtacaaagc 6300 
gttaaagcta agtacttgag cgaaccatca gacaccacct actagattaa gcacctatta 6360
atgartgacc acaaattctg atcgtattgc crgtgatccc tgcagcttga ggttgcgcaa 6420 
aaaaagctat cgcttcagca acaccaactg gcttaccacc ttgttttaat gaattcatac 6480
gacgaccagc ttcacgaact gtaaatggaa tcgctgctgt catttttgtt tcaataaagc 6540
ctiggtgcaac agcatraarg gtgatgtatt tgtctgcaag cggagttcgc artgcatcaa 6600 
cataaccaat gactgcggcc ttagacgCtg cataattagt ctgaccaaag ttacccgcaa 6660 
tcccactcat cgaagacaca caaacaatgc ggccatagtc gtCgagcaga tcatcattta 6720 
gcagtcgctc attgattctt tccattgccg acaagttaat atccatcagt acatcccaat 6780
ggttatccgg catacgtgct agcgttttgt cttttgttac cccggcatta tggacgatga 6840 
tatcaagcga ctgttctcgc acaaagtcag caatgacatt tggggcgtca gcagcggtaa 6900
tatcagcaac aatgctgcta cctttcaagc aatgagctac tt^ctccaagg tcctgtttta 6960 
atgccggaat gtctaagcaa ataacatgtg cgccatcacg ggcgagtgtc tcagcaatag 7020
cagccccgat gccacgtgat gcaccagtga caagtgctgt ctttccttgt aatggttttg 7080 
ccgtgttact tgtttcgtta ataacttcgt taataacttc gctaataact tcgttaatag 7140 
ccccattaat cgaaccgggt tttacgttaa taacctgtgc tgagatatag gctgattttg 7200
ctgaggttaa gaaacgtagc ggggcctcta acaattgctc accaccaggt tgtacataga 7260 
taagttgaca ggtactacca ttcttgccta cttctttggc gacactgcga caaaaccctt 7320 
ctaaagacct Ctgtacagtc gcgtagctta catcgtcaag atgttcaccc ggacgaccta 7380
acacgatcac tctgctgcat ggcgagagct gcttaactac aggtcgaaaa aaacgatgta 7440 
atgcacttaa tcgcttgctg ttcttaatgc ctgaggcgtc gaagataaca ccgctgaagc 7500 
gacctgttct agcgatagca ttaaggcCda taggtgtcgc gactaaagac gtttgattaa 7560 
attcaatatt aagatcggct aacgctgacg tgttattagg ataagaaatc gtgacttcag 7620 
catctttaaa tgtgttaaga atgggtttaa ctaatttgct gttgctggct gcgccgatga 7680 
gcaagttgcc agagatgaga tcggttccct gatcgtagcg cgctaacgta accggtcgtg 7740 
gcagatt^ag cgctttaaat aaacctgatg tccacttgcc attagcgagt tttgcgtatg 7800
tatccgtcat tttctaatcc tcgttatagn gaacagtttg aatctcgaag atgtacacgt 7860
gttaaaaatt atctgatagc tatgacttat ctgccaccac gtaataataa aCagaccagt 7920 
tcattacatc gtraatcgat atagtataac taaatactaa graaattata atgataagac 7980 
tgttatcgta ctcggatcaa actctgacca gcaaataatc aaattagagt ttttatttta 8040
aacttgtatc aacaatgtta cattaacgta tctcacgtct aacgcgctac gggcatattt 8100 
aagtcactaa actaaaggaa taaaccatga caggtcaaac aataagaaga gtagcaaCta 8160 
tcggcggtaa ccgtatcccg tttgcacgtt caaatacagc gcattcaaaa ccaagtaacc 8220 
aagatatgct gacggaaact atccgtggct tggtggttaa acataaccca cgtggtgaac 8280 
aaccggggga agtCgttgct ggtgcggtaa ttaagcattc tcgngattct aacttaacac 8340 
gtgaagccgt gctaagcgca ggtcttgcac ctgaaacgcc ttgtcacgac attcaacaag 8400
ctrgtggtac tggtctagct gcagctatcc aagcagcaaa caaaattgcg cttggtcaaa 8460 
tagaagcggg Catcgctggt ggttctgata cgacaccaga tgcaccgatt gcagtcagtg 8520 
aaggcatgcg tagcgtatta cttgagctta atcgagctaa aacgggtaag caacgtttga 8580
aagcaccatc tcgt'ctacgt ctaaaacact ttgcgccact aacgtictgca aataaagagc 8640 
cgcgtaccaa aanggcgatig ggcgatcatt gtcaagtaac agcgaaagag tggaatatict 8700 
cacgtgaagc acaagacgca ttggettgcg caagtcatca aaaatragct gcagcatatg 8760 
aagaaggttt ctttgatacg ttagtttcac ctatggccgg ottaabgaaa gataacgtat: 8820 
tacgcgcaga tacaacagtt gagaaactgg ctaaattgaa acctrgtttt gataaagtaa 8880
acggcaatat gacggcgggt aacagtiacta accttaccga tggagcacca gctgtattac 8940
tcgcaagtga agaatgggca gcggcacaCa acttaccagt acaagctcac ctaacatttg 9000
gcgaaacggc cgccatcgac ttcgctgata agaaagaagg tctgttaatg gcgcctgcat 9060 
acgcagtgcc aaaaatgttg aagcgtgctg gccttacatt acaagacttc gattactatg 9120 
aaatacatga agcatttgct gcgcagttat tagcaacgct agcagcttgg gaagacgaaa 9180 
aattctgtaa agaaaaactg ggtctagacg ctgcgcccgg ttcaattgat atgaccaagt 9240
taaacgtgaa agggagtagc ttagccacgg gccacccact tgccgcaact ggtggrcgtg 9300 
ttgtcgctac gctagcgcaa ttactcgatc agaaaggttc aggtcgtggt tcgatctcga 9360 
tttgtgctgc tggcggtcaa ggtatcacgg caattttaga gaaataaacg cactgtttat 9420
tatctattga ttaagctgtc ctgagatact ggatatcttt aaataaaacg ccaatactgc 9480 
agagtattgg cgtttttttg taataccaac tcctatataa cggtgcattt taaacactta 9540 
acttccggca ttggtatcat aaaaaagcag caccgaagtg ctgcttgact gCagattaac 9600 
ctattaaaat agagaggcta gaattagtct tcgtatgcct cattacgtac gccagctgca 9660 
cgacccgatg gatcagcatt gtcttggaaa ctttcatccc aagctaatgc ttctacagtt 9720 
gaacaagcaa cggatttacc aaacggtacg cattccgctg ctgaatcacc tgggaagtga 9780 
tcctcaaaga tggcacgata gtagtaacct tctttcgtat ctggtgtgtt aattgggaac 9840 
ttaaatgctg cacttgctaa catttgatca gttaccgcrt cttcaacgtg tactttaagt 9900 
tggccaatcc aagaataacc aacaccatca gagaattgct ctttttgacg ccatacaatt 9960 
tcctcaggta gtaaatcttc aaatgcttcc cgaatgatgt ttttctcaat gcggtcqccc 10020
gtgatcattt ttagttcagg gtttagacgc attgacgcat caacaaactc tttatctaag 10080 
aaaggaacac gtgcttcgat gccccaagcc gccacagact tgtttgcacg taagcaatca 10140
aacatacgta atttatttac tttacgtacc gtctcttcat ggaactctct cgcatttggc 10200
gctttgtgga agtacaagta accaccgaac agttcatcag caccttcacc agaaagcacc 10260
atcttaatcc ccatggcttt aattttacgt: gccattaggt acataggggt tgatgcacga 10320 
atcgttgtta catcgtaggt ttcaatgtgg taaaccacgt cgcgtaaagc gtcgatacct 10380 
tcttgcacag taaattcaat tgaatgatgg atagtaccta agtgatctgc cactttttgt 10440 
gcagcggcta aatctggaga accatttagg cctacagaga aagagtgtag ttgtggccac 10500 
catgcttcgg ttttaccacc gtcttcaata cgacgttttg catactgttg ggtgattgct 10560
gaaataacag atgaatctaa cccgcctgar aataacacgc cgtaaggtac atcacacatt 10620
aattgacgtt taactgcatc ttccaaacct tgcttaacaa cgcttttatc accaccattt 10680
tgtgcaacgt tatcaaaatc tttccaatca cgttgaCaat aaggcgtgac tacaccaccc 10740
Ctactccaca ggtaatgacc tgctgggaat tctrcaattt gagtacaaat tggcactiagt 10800
gctttcattt cagaggcaac ataaaagcta ccgtgttcat catagcccgt ataaagaggg 10860
atgataccga tatggtcacg gccaaccagg taagcgtcct ctgtttcgtc atataaagcg 10920 
aaagcaaaaa taccatttag atcatctaaa aatitgtgtgc ctttttcttt atatagcgca 10980
agtatcactt cgcaatctga ttctgtttgg aattcaaagt cracgttcag cgttttcttt 11040 
aaatctttgt ggttataaat ttcaccatta acagcaagta cgt-gtgtctt: ttcttcatta 11100 
tatagcggct gtgcaccatt atttacatcg acaatagcaa gacgttcatg aactaaaata 11160 
gcattgtcac ttgtatagat acctgaccaa tctgggccgc ggtgacgtag taactttgat 11220 
agttctagtg cttgtircgcg aagaggttta atgtctgatt tgatgtctag aatcccgaat 11280 
attgagcaca taactaattc cttctggggc tgcgtctgca gctaaccttc taaatagtgt 11340 
gtctaatttg ccacattgta gatttaatgc aaacatiraat gataaaacat: ttataaaaaa 11400 
cgtaattcaa cgtggaaccg ataatttaat ggcttaAaag tgaagatcca tr^aactgtga 11460 
tggcgaggtg atagacaaat gcagacctta atgaataaag caggcacgat tgaatccatt 11520 
caacgcaaag tggtactaac cattgrttCa aacgttiaraa atagtgtttt aaaggttara 11580
agraaaCaat tua^aaaoaa taataatcca catgcaccaa attcatcatg atiaaaccgct 11640
atatctcaat ggcaatt^tgg gataagtgta aaataratgt aaaatgaatg agttgacctg 11700
ctttttttac actaagcgat gaaattaaag ctiagatgtcg ttgttagcai:: tgatcaataa 11760
cgtactaaaa tacgac%^ct agtatagaaa tttiaaaaaac agttggtttt gatagcataa 11820 
ctgcataaac taaccagctt attgtctgca acatttttgt aatttaaata ggttitaataa 11880 
aattatargt ctgataaata taaaccgtac gacctttcct craaaaagac gtttttgctg 11940 
cctaagttcr ggcccgrgtg gttcggggtg tttgcaatat acttattagc ttttatgcca 12000 
gtaaagccgc gcgataaatt tgctcgattc atagcgaaga aattgtttag cccaaaaatg 12060 
atggcaaagc gtiaaaaaggc agcaaagatc aatttatcta tgtgcttccc tgaaatggat 12120 
gatacggaac aagaccgtat aatcatggcc aacctagtta ctttttgtca aactatctta 12180 
agttatgcag agccaagtgc gcgtagtcgt gcctataacc gtgaccgtat gatagtgcat 12240 
ggtggcgaga acttatttcc gctacttgaa caaggtaagg cttgtatctt attagtgccg 12300 
catagcttcg ctattgattt tgcaggttta cacattgctt cctatggcgc gccattttgt 12360 
actatgttta acaattctga gaatgagttg ttcgattggc tgatgacacg tcaacgcgct 12420 
atgtttggag gcactgttta tcaccgcaag gcagggctag gggctctagt Caaatcactt 12480 
aagagcggtg aaagctgtta ttacttacct gatgaagacc atggacctaa gcgcagtgta 12540 
tttgcgcctt tatttgcgac tcaaaaagca actttacctg taatgggcaa gccagcagaa 12600 
aaaacaaatg cactcgtcgt tcctgtttat gcggcatata atgaatcact aggtaaattt 12660
gaaaccttta rtcgaccagc aatgcaaaac tttccaccag aaagcccaga acaagatgca 12720 
gtgatgatga ataaagagat tgaagccttg attgaatgtg gtgctgatca atatatgtgg 12780 
acacttagat tattgagaac acgtccggac ggraaaaaaa tctactaata aagtttaata 12840 
aacaccataa ccttcgttga acatggtgtt tacccccctg aataccctct aaattaataa 12900 
caaaaaaagc catttacgta acacctaatg atgatttagc ctgcacttgc tttgttttta 12960 
gtcttaagag cctaataaac ttgatctagg catagattct gtctttcttt acgtaacgcg 13020 
atctattttt tttaaccgat agctgttata actagtctca tacgaaagag atatcgtttc 13080 
agtaaaagct attncgtttc aatagataat ttatttatag tcatattttc tgtaatgaca 13140
atcattttct catctagact atagataaga atacgaatta agtaagaaca ttaattttac 13200 
aagaatataa aatatcccat cggagctata agaatgaaaa agactaaaat tgtttgtaca 13260 
attggtccaa aaactgaatc agtagagaaa ctaacagagc ttgttaatgc aggcatgaac 13320 
gttatgcgtt taaatttctc tcatggtaac tttgctgaac attcagtgcg tattcaaaat 13380 
atccgtcaag taagtgaaaa cctgaataag aaaattgctg ttttactgga tactaaaggt 13440 
ccagaaatcc gtacgattaa actagaaaac ggtgacgatg taatgttgac cgctggtcag 13500 
tcattcacgt ttacaacaga cattaacgtg gtaggtaata aagactgtgt tgctgtaaca 13560 
tatgctggtt ttgctaaaga ccttaatcct ggtgcaatca tccttgttga tgatggttta 13620 
attgaaatgg aagttgttgc aacaactgac accgaagtta aatgtacagt attaaatact 13680 
ggtgcacttg gtgaaaataa aggcgttaac ttacctaaca tcagtgtagg tctacctgca 13740 
ttgtcagaaa aagataaagc tgatttagcg tttggttgtg agcaagaagt tgattttgtt 13800 
gctgcatcat ttattcgtaa ggctgatgat gtaagagaaa ttcgtgaaat cctatttaat 13860 
aatggtggcg aaaacattca gattatctcg aaaattgaaa accaagaagg tgtagacaat 13920 
ttcgatgaaa tcttagctga atcagacggt atcatggttg ctcgtggcga tctcggtgtt 13980 
gagatcccag ttgaagaagt gatcatggca cagaagatga tgatcaaaaa atgtaataaa 14040 
gcaggtaaag ttgtaattac tgcaacacaa atgcttgatt caatgatcag taacccacgt 14100 
ccaacacgtg cagaagcggg cgatgttgcc aatgctgtgc ttgacggtac cgacgcggta 14160 
atgctttctg gtgaaactgc gaaaggtaaa tacccagttg aagctgtgtc tatcatggca 14220 
aacatctgtg aacgtactga taactcaatg tcttcggatt taggtgcgaa cattgttgct 14280 
aaaagcatgc gcattacaga agctgtgtgt aaaggtgcgg tagaaacaac agaaaaattg 14340 
tgtgx:tccac ttattgttgt tgcaactcgt ggcggtaaat cagcaaaatc tgttcgtaaa 14400 
tacttcccga aagcaaatat tcttgctatc acaacaaatg aaaaagcagc gcaacagtta 14460 
tgcctaacta aaggcgtaag cagctgcatc gttgagcaga ttgatagcac tgatgagttc 14520 
taccgtaaag gtaaagagct tgcattagca actggtttag ctaaagaagg cgatatcgtt 14580 
gttatggtat caggtgcgtt agtaccatca ggtacaacga atacggcatc tgttcaccaa 14640 
ctttaagttg ccatattgat attataaaaa agagagcgta tgctctcttt ttttatatct 14700 
gtagtttata tgtctgtaca aaaaaatgat aaagagtaca taaactatta atatagcgta 14760 
atatataatg attaacggtg atgaaagggt taaataaatg gatagtgcta aacataaaat 14820 
tggcttagtc ctttctggcg gtggtgcgaa aggtattgct catcttggtg tattaaaata 14880 
cctgttagag caagatataa gaccgaatgt aattgcgggt acaagtgctg gctctatggt 14940 
tggtgcactt tattgctcag gacttgagat tgatgacatt ttacaattct tcatcgatgt 15000 
aaaacctttt tcttggaagt ttacccgtgc ccgtgctggc tttatagacc cggcaaaatt 15060 
atatcctgaa gtgctaaaat atatccccga ggatagcttt gagtaccttc aacctgaatt 15120 
gcgcattgtt gccaccaaca tgttactcgg taaagagcat atatttaaag atggctccgt 15180 
gattaatgcc ttattagcat cagccagcta ccctttagtt ttttctccga tgatcattga 15240
cgatcaagtg tattcagatg gcggtattgt taatcatttc cccgtgagtg tcattgaaga 15300 
tgattgcgat aaaataatcg gcgtatacgt gtcgcccatt cgtcaggtcg aagctgacga 15360 
actctcgagt ataaaagacg tggtattacg tgcgttcacg ctgcagggta gtggtgctga 15420 
attagataaa ctatcgcaat gtgatgtgca aatttatcca gaagcgctat tgaattacaa 15480 
tacgtttgca accgatgaaa aatcattacg ggagatctac cagattggtt atgatgctgc 15540 
aaaagatcaa catgacaacc ttatggcatt gaaagaaagt atcaccacca gcgaggttaa 15600 
aaagaacgtc tttagcaaat ggtttggtga taaacttgct agcaacagcg gcaaatagcg 15660 
gcccacacgg atttatacac taggataatg ggcgttaata gcctcactgt cgttgtgtgg 15720 
tctctaattt tagctaaatc ttgtgttata ctgacttcct attaatcata aacgatttat 15780 
cacggtaaac atgactcaaa taaataaccc gcttcacggc atgacactcg aaaaagtaat 15840 
taacagtctc gttgaacaat atggctggga tggtcttgga tactacatca acattcgttg 15900 
ctttactgaa aatccaagtg ttaagtctag tcttaaattt ttacgtaaaa ccccttgggc 15960 
actgataaa gtagaagcgc tatatatcaa aatggtgact gaaggctaac tgtctccacg 16020
cagcgaacc gctgtttata g.taatataa g.actataag cagggctcgt taat.cagta 16080
tgtaattaat cctgaacacc tccgctta.t tcaacat.gt actctctaga taacac c 16140
lattacac cttcaacatc acagcc.cca cataacatcc gatgacatag c=ctgttatt 16200
tttcacattt atctatatgc tatatatttt agccatttga tcaattgagt taatttctgc 16260
aatgacaaag atataccatc atccagtaca aatttattat gaagataccg accattctgg 16320
tgttgtttac caccctaact ttttaaaata ctttgaacgt gcacgtgagc atgtgataaa 16380
tagtgactta ctagcaacac tgtggaatga acgcggttta ggttttgcgg tgtataaag^ 16440
caltatgact tttcaggatg gggtcgaatt tgctgaagtg tgtgatattc gcacttcttt 16500
tgtcctagac ggtaagtaca aaacgatctg gcgccaagaa gtatggcgtc cgaatgcgac 16560
tagggctgcc gttatcggtg atattgaaat ggtgtgctta gacaaacaaa aacgtttaca 16620
gccc!tccct gatgatgtgt tagctgcaat ggt.agtgaa taaatggttc atgcataaat 16680
Igttaataca tgattctggc ccgtcacgtt tacagataag aggcatccga tgcctccttc 16740
cLttaccaa tactactgct tatccctttc taactatctt tagcgtccat aacacactga 16800 
gcatttattc tattaatcag tgattgtgat ttaatta.ct tctatatatg taatttaatg 16860
taattttcaa tttattttta gctacattaa ggcttacgaa tgtacgctaa aatgagatgt 16920 
cagactaatt ttagcttatt aa.ctgttag ccgtttatat tttataaaga tgggatttaa 16980 
cttaaatgca attaattatg gcgtaaatag agtgaaaaca tggctaatat tcactaagtc 17040
ctgaatttta tataaagttt aatctgttat tttagcgttt acctggtctt atcagtgagg 17100
tttatagcca ttattagtgg gattgaagtg acttttaaag ctatgtatat ^attgcaaat 17160
ataaattgta acaattaaga ctttggacac ttgagttcaa tttcgaattg attggcataa 17220
aattrtaaaac agctaaatct acctcaatca ttttagcaaa tgtatgcagg ^agatttttt 17280
tcgccattta agagtacact tgtacgctag gtttttgttt agtgtgcaaa tgaacgtttt 17340
gatgagcatt gtttttagag cacaaaatag atccttacag gagcaataac gcaatggcta 17400
aaa!g!acac cacatcgatt aagcacgcca aggatgtgtt aagtagtgat gatcaacagt 17460
taaattctcg cttgcaagaa tgtccgattg ccatcattgg tatggcatcg 9"tttgcag 17520
atgctaaaaa cttggatcaa ttctgggata acatcgttga ctctgtggac gctattattg 17580
atgtgcctag cgatcgctgg aacattgacg accattactc ggctgataaa aaagcagctg 17640
acaagacata ctgcaaacgc ggtggtttca ttccagagct tgattttgat ccgatggagt 17700
ttggtttacc gccaaatatc ctcgagttaa ctgacatcgc tcaattgttg tcattaattg 17760
ttgctcgtga tgtattaagt gatgctggca ttggtagtga ttatgacoat gataaaattg 17820
gtatcacgct gggtgtcggt ggtggtcaga aacaaatttc gccattaacg tcgcgcctac 17880
aaggcccggt attagaaaaa gtattaaaag cctcaggcat tgatgaagat gatcgcgcta 17940 
tgatcatcga caaatttaaa aaagcctaca tcggctggga agagaactca ttcccaggca 18000 
tgctaggtaa cgttattgct ggtcgtatcg ccaatcgttt tgattttggt ggtactaact 18060
gtgtggttga tgcggcatgc gctggctccc ttgcagctgt taaaatggcg atc.cagact 18120
tacttgaata tcgttcagaa gtcatgatat cgggtggtgt atgttgtgat aactcgccat 18180
tcatgtatat gtcattctcg aaaacaccag catttaccac caatgatgat atccgtccgt 18240 
ttgatgacga ttcaaaaggc atgctggttg gtgaaggtat tggcatgatg gcgtttaaac 18300 
gtcttgaaga tgctgaacgt gacggcgaca aaatttattc tgtactgaaa ggtatcggta 18360 
catcttcaga tggtcgtttc aaatctattt acgctccacg cccagatggc caagcaaaag 18420 
cgctaaaacg tgcttatgaa gatgccggtt ttgcccctga aacatgtggt ctaattgaag 18480 
gccatggtac gggtaccaaa gcgggtgatg ccgcagaatt tgctggcttg accaaacact 18540
ttggcgccgc cagtgatgaa aagcaatata tcgccttagg ctcagttaaa tcgcaaattg 18600
gtcatactaa atctgcggct ggctctgcgg gtatgattaa ggcggcatta gcgctgcatc 18660
ataaaatctt acctgcaacg atccatatcg ataaaccaag tgaagccttg gatatcaaaa 18720 
acaqcccgtt atacctaaac agcgaaacgc gtccttggat gccacgtgaa gatggtattc 18780 
cacgtcgtgc aggtatcagc tcatttggtt ttggcggcac caacttccat attattttag 18840 
aagagtatcg cccaggtcac gatagcgcat atcgcttaaa ctcagtgagc caaactgtgt 18900
tgatctcggc aaacgaccaa caaggtattg ttgctgagtt aaataactgg cgtactaaac 18960 
tggctgtcga tgctgatcat caagggtttg tatttaatga gttagtgaca acgtggccat 19020 
taaaaacccc atccgttaac caagctcgtt taggttttgt tgcgcgtaat gcaaatgaag 19080 
cgatcgcgat gattgatacg gcattgaaac aattcaatgc gaacgcagat aaaatgacat 19140 
ggtcagtacc taccggggtt tactatcgtc aagccggtat tgatgcaaca ggtaaagtgg 19200 
ttgcgctatt ctcagggcaa ggttcgcaat acgtgaacat gggtcgtgaa ttaacctgta 19260 
acttcccaag catgatgcac agtgctgcgg cgatggataa agagttcagt gccgctggtt 19320 
taggccagtt atctgcagtt actttcccta tccctgttta tacggatgcc gagcgtaagc 19380 
tacaagaaga gcaattacgt ttaacgcaac atgcgcaacc agcgattggt agtttgagtg 19440 
ttggtctgtt caaaacgttt aagcaagcag gttttaaagc tgattttgct gccggtcata 19500 
gtttcggtga gttaaccgca ttatgggctg ccgatgtatt gagcgaaagc gattacatga 19560 
tgttagcgcg tagtcgtggt caagcaatgg ctgcgccaga gcaacaagat tttgatgcag 19620 
gtaagatggc cgctgttgtt ggtgatccaa agcaagtcgc tgtgatcatt gatacccttg 19680 
atgatgtctc tattgctaac ttcaactcga ataaccaagt tgttattgct ggtactacgg 19740 
agcaggttgc tgtagcggtt acaaccttag gtaatgctgg tttcaaagtt gtgccactgc 19800 
cggtatctgc tgcgttccat acacctttag ttcgtcacgc gcaaaaacca tttgctaaag 19860 
cggttgatag cgctaaattt aaagcgccaa gcattccagt gtttgctaat ggcacaggct 19920 
tggtgcattc aagcaaaccg aatgacatta agaaaaacct gaaaaaccac atgctggaat 19980 
ctgttcattt caatcaagaa attgacaaca tctatgctga tggtggccgc gtatttatcg 20040 
aatttggtcc aaagaatgta ttaactaaat tggttgaaaa cattctcact gaaaaatctg 20100 
atgtrgactgc tatcgcggtt aatgctaatc ctaaacaacc tgcggacgta caaatgcgcc 20160 
aagctgcgct gcaaatggca gtgcttggtg tcgcattaga caatattgac ccgtacgacg 20220 
ccgttaagcg tccacttgtt gcgccgaaag catcaccaat gttgatgaag ttatctgcag 20280 
cgtcttatgt tagtccgaaa acgaagaaag cgtttgctga tgcattgact gatggctgga 20340 
ctgttaagca agcgaaagct gtacctgctg ttgtgtcaca accacaagtg attgaaaaga 20400 
tcgttgaagt tgaaaagata gttgaacgca ttgtcgaagt agagcgtatt gtcgaagtag 20460 
aaaaaatcgt ctacgttaat gctgacggtt cgcttatatc gcaaaataat caagacgtta 20520 
acagcgctgt tgttagcaac gtgactaata gctcagtgac tcatagcagt gatgctgacc 20580 
ttgttgcctc tattgaacgc agtgttggtc aatttgttgc acaccaacag caattattaa 20640 
atgtacatga acagtttatg caaggtccac aagactacgc gaaaacagtg cagaacgtac 20700 
ttgctgcgca gacgagcaat gaattaccgg aaagtttaga ccgtacattg tctatgtata 20760 
acgagttcca atcagaaacg ctacgtgtac atgaaacgta cctgaacaat cagacgagca 20820 
acatgaacac catgcttact ggtgctgaag ctgatgtgct agcaacccca ataactcagg 20880
tagtgaatac agccgttgcc actagtcaca aggtagttgc tccagttatt gctaatacag 20940 
tgacgaatgt tgtatctagt gtcagtaata acgcggcggt tgcagtgcaa actgtggcat 21000 
tagcgcctac gcaagaaatc gctccaacag tcgctactac gccagcaccc gcattggttg 21060 
ctatcgtggc tgaacctgtg attgttgcgc atgttgctac agaagttgca ccaattacac 21120 
catcagttac accagttgtc gcaactcaag cggctatcga tgtagcaact attaacaaag 21180 
taatgttaga agttgttgct gataaaaccg gttatccaac ggatatgctg gaactgagca 21240 
tggacatgga agctgactta ggtatcgact caatcaaacg tgttgagata ttaggcgcag 21300 
tacaggaatt gatccctgac ttacctgaac ttaatcctga agatcttgct gagctacgca 21360 
cgcttggtga gattgtcgat tacatgaatt caaaagccca ggctgtagct cctacaacag 21420 
tacctgtaac aagtgcacct gtttcgcctg catctgctgg tattgattta gcccacatcc 21480 
aaaacgtaat gttagaagtg gttgcagaca aaaccggtta cccaacagac atgctagaac 21540 
tgagcatgga tatggaagct gacttaggta ttgattcaat caagcgtgtg gaaatcttag 21600 
gtgcagtaca ggagatcata actgatttac ctgagctaaa ccctgaagat cttgctgaat 21660 
tacgcaccct aggtgaaatc gttagttaca tgcaaagcaa agcgccagtc gctgaaagtg 21720 
cgccagtggc gacggctcct gtagcaacaa gctcagcacc gtctatcgat ttgaaccaca 21780
ttcaaacagt gatgatggat gtagttgcag ataagactgg ttatccaact gacatgctag 21840 
aacttggcat ggacatggaa gctgatttag gtatcgattc aatcaaacgt gtggaaatat 21900 
taggcgcagt gcaggagatc atcactgatt tacctgagct aaacccagaa gacctcgctg 21960 
aattacgcac gctaggtgaa atcgttagtt acatgcaaag caaagcgcca gtcgctgaga 22020 
gtgcgccagt agcgacggct tctgtagcaa caagctctgc accgtctatc gatttaaacc 22080 
atatccaaac agtgatgatg gaagtggttg cagacaaaac cggttatcca gtagacatgt 22140 
tagaacttgc tatggacatg gaagctgacc taggtatcga ttcaatcaag cgtgtagaaa 22200 
ttttaggtgc ggtacaggaa atcattactg acttacctga gcttaaccct gaagatcttg 22260 
ctgaactacg tacattaggt gaaatcgtta gttacatgca aagcaaagcg cccgtagctg 22320 
aagcgcctgc agtacctgtt gcagtagaaa gtgcacctac tagtgtaaca agctcagcac 22380 
cgtctatcga tttagaccac atccaaaatg taatgatgga tgttgttgct gataagactg 22440 
gttatcctgc caatatgctt gaattagcaa tggacatgga agccgacctt ggtattgatt 22500
caatcaagcg tgttgaaatt ctaggcgcgg tacaggagat cattactgat ttacctgaac 22560 
taaacccaga agacttagct gaactacgta cgttagaaga aattgtaacc tacatgcaaa 22620 
gcaaggcgag tggtgttact gtaaatgtag tggctagccc tgaaaataat gctgtatcag 22680 
atgcatttat gcaaagcaat gtggcgacta tcacagcggc cgcagaacat aaggcggaat 22740
ttaaaccggc gccgagcgca accgttgcta tctctcgtct aagctctatc agtaaaataa 22800 
gccaagattg taaaggtgct aacgccttaa tcgtagctga tggcactgat aatgctgtgt 22860 
tacttgcaga ccacctattg caaactggct ggaatgtaac tgcattgcaa ccaacttggg 22920 
tagctgtaac aacgacgaaa gcatttaata agtcagtgaa cctggtgact ttaaatggcg 22980 
ttg^gaaac tgaaatcaac aacattatta ctgctaacgc acaattggat gcagttatct 23040 
atctgcacgc aagtagcgaa attaatgcta tcgaataccc acaagcatct aagcaaggcc 23100 
tgatgttagc cttcttatta gcgaaattga gtaaagtaac tcaagccgct aaagtgcgtg 23160 
gcgcctttat gattgttact cagcagggtg gttcattagg ttttgatgat atcgattctg 23220 
ctacaagtca tgatgtgaaa acagacctag tacaaagcgg cttaaacggt ttagttaaga 23280 
cactgtctca cgagtgggat aacgtattct gtcgtgcggt tgatattgct tcgtcattaa 23340 
cggctgaaca agttgcaagc cttgttagtg atgaactact tgatgctaac actgtattaa 23400 
cagaagtggg ttatcaacaa gctggtaaag gccttgaacg tatcacgtta actggtgtgg 23460 
ctactgacag ctatgcatta acagctggca ataacatcga tgctaactcg gtatttttag 23520 
tgagtggtgg cgcaaaaggt gtaactgcac attgtgttgc tcgtatagct aaagaatatc 23580 
agtctaagtt catcttattg ggacgttcaa cgttctcaag tgacgaaccg agctgggcaa 23640 
gtggtattac tgatgaagcg gcgttaaaga aagcagcgat gcagtctttg attacagcag 23700 
gtgataaacc aacacccgtt aagatcgtac agctaatcaa accaatccaa gctaatcgtg 23760 
aaattgcgca aaccttgtct gcaattaccg ctgctggtgg ccaagctgaa tatgtttctg 23820 
cagatgtaac taatgcagca agcgtacaaa tggcagtcgc tccagctatc gctaagttcg 23880 
gtgcaatcac tggcatcatt catggcgcgg gtgtgttagc tgaccaattc attgagcaaa 23940 
aaacactgag tgattttgag tctgtttaca gcactaaaat tgacggtttg ttatcgctac 24000 
tatcagtcac tgaagcaagc aacatcaagc aattggtatt gttctcgtca gcggctggtt 24060 
tctacggtaa ccccggccag tctgattact cgattgccaa tgagatctta aataaaaccg 24120 
cataccgctt taaatcattg cacccacaag ctcaagtatt gagctttaac tggggtcctt 24180 
gggacggtgg catggtaacg cctgagctta aacgtatgtt tgaccaacgt ggtgtttaca 24240 
ttattccact tgatgcaggt gcacagttat tgctgaatga actagccgct aatgataacc 24300 
gttgtccaca aatcctcgtg ggtaatgact tatctaaaga tgctagctct gatcaaaagt 24360 
ctgatgaaaa gagtactgct gtaaaaaagc cacaagttag tcgtttatca gatgctttag 24420 
taactaaaag tatcaaagcg actaacagta gctctttatc aaacaagact agtgctttat 24480 
cagacagtag tgcttttcag gttaacgaaa accacttttt agctgaccac atgatcaaag 24540 
gcaatcaggt attaccaacg gtatgcgcga ttgcttggat gagtgatgca gcaaaagcga 24600 
cttatagtaa ccgagactgc gcattgaagt atgtcggttt cgaagactat aaattgttta 24660
aaggtgtggt ttttgatggc aatgaggcgg cggattacca aatccaattg tcgcctgtga 24720 
caagggcgtc agaacaggat tctgaagtcc gtattgccgc aaagatcttt agcctgaaaa 24780 
gtgacggtaa acctgtgttt cattatgcag cgacaatatt gttagcaact cagccactta 24840 
atgctgtgaa ggtagaactt ccgacattga cagaaagtgt tgatagcaac aataaagtaa 24900 
ctgatgaagc acaagcgtta tacagcaatg gcaccttgtt ccacggtgaa agtctgcagg 24960 
gcattaagca gatattaagt tgtgacgaca agggcctgct attggcttgt cagataaccg 25020 
atgttgcaac agctaagcag ggatccttcc cgttagctga caacaatatc tttgccaatg 25080
atttggttta tcaggctatg ttggtctggg tgcgcaaaca atttggttta ggtagcttac 25140 
cttcggtgac aacggcttgg actgtgtatc gtgaagtggt tgtagatgaa gtattttatc 25200 
tgcaacttaa tgttgttgag catgatctat tgggttcacg cggcagtaaa gcccgttgtg 25260 
atattcaatt gattgctgct gatatgcaat tacttgccga agtgaaatca gcgcaagtca 25320 
gtgtcagtga cattttgaac gatatgtcat gatcgagtaa ataataacga taggcgtcat 25380 
ggtgagcatg gcgtctgctt tcttcatttt ttaacattaa caatattaat agctaaacgc 25440 
ggttgcttta aaccaagtaa acaagtgctt ttagctatta ctattccaaa caggatatta 25500 
aagagaatat gacggaatta gctgttattg gtatggatgc taaatttagc ggacaagaca 25560 
atattgaccg tgtggaacgc gctttctatg aaggtgctta tgtaggtaat gttagccgcg 25620 
ttagtaccga atctaatgtt attagcaatg gcgaagaaca agttattact gccatgacag 25680 
ttcttaactc tgtcagtcta ctagcgcaaa cgaatcagtt aaatatagct gatatcgcgg 25740 
tgttgctgat tgctgatgta aaaagtgctg atgatcagct tgtagtccaa attgcatcag 25800 
caattgaaaa acagtgtgcg agttgtgttg ttattgctga tttaggccaa gcattaaatc 25860 
aagtagctga tttagttaat aaccaagact gtcctgtggc tgtaattggc atgaataact 25920 
cggttaattt atctcgtcat gatcttgaat ctgtaactgc aacaatcagc tttgatgaaa 25980 
cctrcaatgg ttataacaat gtagctgggt tcgcgagttt acttatcgct tcaactgcgt 26040 
ttgccaatgc taagcaatgt tatatatacg ccaacattaa gggcttcgct caatcgggcg 26100 
taaatgctca atttaacgtt ggaaacatta gcgatactgc aaagaccgca ttgcagcaag 26160 
ctagcataac tgcagagcag gttggtttgt tagaagtgtc agcagtcgct gattcggcaa 26220 
tcgcattgtc tgaaagccaa ggtttaatgt ctgcttatca tcatacgcaa actttgcata 26280 
ctgcattaag cagtgcccgt agtgtgactg gtgaaggcgg gtgtttttca caggtcgcag 26340 
gtttattgaa atgtgtaatt ggtttacatc aacgttatat tccggcgatt aaagattggc 26400 
aacaaccgag tgacaatcaa atgtcacggt ggcggaattc accattctat atgcctgtag 26460 
atgctcgacc ttggttccca catgctgatg gctctgcaca cattgccgct tatagttgtg 26520 
tgactgctga cagctattgt catattcttt tacaagaaaa cgtcttacaa gaacttgttt 26580 
tgaaagaaac agtcttgcaa gataatgact taactgaaag caagcttcag actcttgaac 26640 
aaaacaatcc agtagctgat ctgcgcacta atggttactt tgcatcgagc gagttagcat 26700 
taatcatagt acaaggtaat gacgaagcac aattacgctg tgaattagaa actattacag 26760 
ggcagttaag tactactggc ataagtacta tcagtattaa acagatcgca gcagactgtt 26820 
atgcccgtaa tgatactaac aaagcctata gcgcagtgct tattgccgag actgctgaag 26880 
agttaagcaa agaaataacc ttggcgtttg ctggtatcgc tagcgtgttt aatgaagatg 26940 
ctaaagaatg gaaaaccccg aagggcagtt attttaccgc gcagcctgca aataaacagg 27000 
ctgctaacag cacacagaat ggtgtcacct tcatgtaccc aggtattggt gctacatatg 27060 
ttggtttagg gcgtgatcta tttcatctat tcccacagat ttatcagcct gtagcggctt 27120 
tagccgatga cattggcgaa agtctaaaag atactttact taatccacgc agtattagtc 27180 
gtcatagctt taaagaactc aagcagttgg atctggacct gcgcggtaac ttagccaata 27240 
tcgctgaagc cggtgtgggt tttgcttgtg tgtttaccaa ggtatttgaa gaagtctttg 27300 
ccgttaaagc tgactttgct acaggttata gcatgggtga agtaagcatg tatgcagcac 27360 
taggctgctg gcagcaaccg ggattgatga gtgctcgcct tgcacaatcg aataccttta 27420 
atcatcaact ttgcggcgag ttaagaacac tacgtcagca ttggggcatg gatgatgtag 27480 
ctaacggtac gttcgagcag atctgggaaa cctataccat taaggcaacg attgaacagg 27540

tcgaaattgc ctctgcagat gaagatcgtg tgtattgcac cattatcaat acacctgata 27600 

gcttgttgtt agccggttat ccagaagcct gtcagcgagt cattaagaat ttaggtgtgc 27660 

gtgcaatggc attgaatatg gcgaacgcaa ttcacagcgc gccagcttat gccgaatacg 27720 

atcatatggt tgagctatac catatggatg ttactccacg tattaatacc aagatgtatt 27780 

caagctcatg ttatttaccg attccacaac gcagcaaagc gatttcccac agtattgcta 27840 

aatgtttgtg tgatgtggtg gatttcccac gtttggttaa taccttacat gacaaaggtg 27900 

cgcgggtatt cattgaaatg ggtccaggtc gttcgttatg tagctgggta gataagatct 27960 

tagttaatgg cgatggcgat aataaaaagc aaagccaaca tgtatctgtt cctgtgaatg 28020 

ccaaaggcac cagtgatgaa cttacttata ttcgtgcgat tgctaagtta attagtcatg 28080 

gcgtgaattt gaatttagat agcttgttta acgggtcaat cctggttaaa gcaggccata 28140 

tagcaaacac gaacaaatag tcaacatcga tatctagcgc tggtgagtta tacctcatta 28200 

gttgaaatat ggatttaaag agagtaatta tggaaaatat tgcagtagta ggtattgcta 28260

atttgttccc gggctcacaa gcaccggatc aattttggca gcaattgctt gaacaacaag 28320

attgccgcag taaggcgacc gctgttcaaa tgggcgttga tcctgctaaa tataccgcca 28380 

acaaaggtga cacagataaa ttttactgtg tgcacggcgg ttacatcagt gatttcaatt 28440 

ttgatgcttc aggttatcaa ctcgataatg attatttagc cggtttagat gaccttaatc 28500 

aatgggggct ttatgttacg aaacaagccc ttaccgatgc gggttattgg ggcagtactg 28560

cactagaaaa ctgtggtgtg attttaggta atttgtcatt cccaactaaa tcatctaatc 28620 

agctgtttat gcctttgtat catcaagttg ttgataatgc cttaaaggcg gtattacatc 28680 

ctgattttca attaacgcat tacacagcac cgaaaaaaac acatgctgac aatgcattag 28740

tagoaggtta tccagctgca ttgatcgcgc aagcggcggg tcttggtggt tcacattttg 28800 

cactggatgc ggcttgtgct tcatcttgtt atagcgttaa gttagcgtgt gattacctgc 28860

atacgggtaa agccaacatg atgcttgctg gtgcggtatc tgcagcagat cctatgttcg 28920 

taaatatggg tttctcgata ttccaagctt acccagctaa caatgtacat gccccgtttg 28980

accaaaattc acaaggtcta tttgccggtg aaggcgcggg catgatggta ttgaaacgtc 29040 

aaagtgatgc agtacgtgat ggtgatcata tttacgccat tattaaaggc ggcgcattat 29100 

cgaatgacgg taaaggcgag tttgtattaa gcccgaacac caagggccaa gtattagtat 29160

atgaacgtgc ttatgccgat gcagatgttg acccgagtac agttgactat attgaatgtc 29220

atgcaacggg cacacctaag ggtgacaatg ttgaattgcg ttcgatggaa acctttttca 29280

gtcgcgtaaa taacaaacca ttactgggct cggttaaatc taaccttggt catttgttaa 29340 

ctgccgctgg tatgcctggc atgaccaaag ctatgttagc gctaggtaaa ggtcttattc 29400 

ctgcaacgat taacttaaag caaccactgc aatctaaaaa cggttacttt actggcgagc 29460 

aaatgccaac gacgactgtg tcttggccaa caactccggg tgccaaggca gataaaccgc 29520

gtaccgcagg tgtgagcgta tttggttttg gtggcagcaa cgcccatttg gtattacaac 29580

agccaacgca aacactcgag actaatttta gtgttgctaa accacgtgag cctttggcta 29640 

ttattggtat ggacagccat tttggtagtg ccagtaattt agcgcagttc aaaaccttat 29700 

taaataataa tcaaaatacc ttccgtgaat taccagaaca acgctggaaa ggcarggaaa 29760

gtaacgctaa cgtcatgcag tcgttacaat tacgcaaagc gcctaaaggc agttacgttg 29820 

aacagctaga tattgatttc ttgcgtttta aagtaccgcc taatgaaaaa gattgcttga 29880

tcccgcaaca gttaatgatg atgcaagtgg cagacaatgc tgcgaaagac ggaggtctag 29940 

ttgaaggtcg taatgttgcg gtattagtag cgatgggcat ggaactggaa ttacatcagt 30000 

atcgtggtcg cgttaatcta accacccaaa ttgaagacag cttattacag caaggtatta 30060 

acctgactgt tgagcaacgt gaagaactga ccaatattgc taaagacggt gttgcctcgg 30120 

ctgcacagct aaatcagtat acgagtttca ttggtaatat tatggcgtca cgtatttcgg 30180 

cgttatggga tttttctggt cctgctatta ccgtatcggc tgaagaaaac tctgtttatc 30240 

gttgtgttga attagctgaa aatctatttc aaaccagtga tgttgaagcc gttattattg 30300 

ctgctgttga tttgtctggt tcaattgaaa acattacttt acgtcagcac tacggtccag 30360 
ttaatgaaaa gggatctgta agtgaatgtg gtccggttaa tgaaagcagt tcagtaacca 30420
acaatattct tgatcagcaa caatggctgg tgggtgaagg cgcagcggct attgtcgtta 30480 
aaccgtcatc gcaagtcact gctgagcaag tttatgcgcg tattgatgcg gtgagttttg 30540 
cccctggtag caatgcgaaa gcaattacga ttgcagcgga taaagcatta acacrtgctg 30600 
gtatcagtgc tgctgatgta gctagtgttg aagcacatgc aagtggtttt agtgccgaaa 30660 
ataatgctga aaaaaccgcg ttaccgactt tatacccaag cgcaagtatc agttcggtga 30720 
aagccaatat tggtcatacg tttaatgcct cgggtatggc gagtattatt aaaacggcgc 30780 
tgctgttaga tcagaatacg agtcaagatc agaaaagcaa acatattgct attaacggtc 30840 
taggtcgtga taacagctgc gcgcatctta tcttatcgag ttcagcgcaa gcgcatcaag 30900 
ttgcaccagc gcctgtatct ggtatggcca agcaacgccc acagttagtt aaaaccatca 30960 
aactcggtgg tcagttaatt agcaacgcga ttgttaacag tgcgagttca tctttacacg 31020 
ctattaaagc gcagtttgcc ggtaagcact taaacaaagt taaccagcca gtgacgatgg 31080 
ataacctgaa gccccaaggt attagcgctc atgcaaccaa tgagtatgtg gtgactggag 31140 
ctgctaacac tcaagcttct aacattcaag catctcatgt tcaagcgtca agtcatgcac 31200 
aagagatagc accaaaccaa gttcaaaata tgcaagctac agcagccgct gtaagttcac 31260 
ccctttctca acatcaacac acagcgcagc ccgtagcggc accgagcgtt gttggagtga 31320 
ctgtgaaaca taaagcaagt aaccaaattc atcagcaagc gtctacgcat aaagcatttt 31380 
tagaaagtcg tttagctgca cagaaaaacc tatcgcaact tgttgaattg caaaccaagc 31440 
tgtcaatcca aactggtagt gacaatacat ctaacaatac tgcgtcaaca agcaatacag 31500 
tgctaacaaa tcctgtatca gcaacgccat taacacttgt gtctaatgcg cctgtagtag 31560 
cgacaaacct aaccagtaca gaagcaaaag cgcaagcagc tgctacacaa gctggttttc 31620 
agataaaagg acctgttggt tacaactatc caccgctgca gttaattgaa cgttataata 31680 
aaccagaaaa cgtgatttac gatcaagctg atttggttga attcgctgaa ggtgatattg 31740 
gtaaggtatt tggtgctgaa tacaatatta ttgatggcta ttcgcgtcgt gtacgtctgc 31800
caacctcaga ttacttgtta gtaacacgtg ttactgaact tgatgccaag gtgcatgaat 31860
acaagaaatc atacatgtgt actgaatatg atgtgcctgt tgatgcaccg ttcttaattg 31920 
atggtcagat cccttggtct gttgccgtcg aatcaggcca gtgtgatttg atgttgattt 31980 
catatatcgg tattgatttc caagcgaaag gcgaacgtgt ttaccgttta cttgattgtg 32040 
aattaacttt ccttgaagag atggcttttg gtggcgatac tttacgttac gagatccaca 32100 
ttgattcgta tgcacgtaac ggcgagcaat tattattctt cttccattac gattgttacg 32160 
taggggataa gaaggtactt atcatgcgta atggttgtgc tggtttcttt actgacgaag 32220 
aactttctga tggtaaaggc gttattcata acgacaaaga caaagctgag tttagcaatg 32280 
ctgttaaatc atcattcacg ccgttattac aacataaccg tggtcaatac gattataacg 32340 
acatgatgaa gttggttaat ggtgatgttg ccagttgttt tggtccgcaa tatgatcaag 32400 
gtggccgtaa tccatcattg aaattctcgt ctgagaagtt cttgatgatt gaacgtatta 32460 
ccaagataga cccaaccggt ggtcattggg gactaggcct gttagaaggt cagaaagatt 32520 
tagaccctga gcattggtat ttcccttgtc actttaaagg tgatcaagta atggctggtt 32580
cgttgatgtc ggaaggttgt ggccaaatgg cgatgttctt catgctgtct cttggtatgc 32640 
ataccaatgt gaacaacgct cgtttccaac cactaccagg tgaatcacaa acggtacgtt 32700 
gtcgtgggca agtactgcca cagcgcaata ccttaactta ccgtatggaa gttactgcga 32760 
tgggtatgca tccacagcca ttcatgaaag ctaatattga tattttgctt gacggtaaag 32820 
tggttgttga tttcaaaaac ttgagcgtga tgatcagcga acaagatgag cattcagatt 32880 
accctgtaac actgccgagt aatgtggcgc ttaaagcgat tactgcacct gttgcgtcag 32940
tagcaccagc atcttcaccc gctaacagcg cggatctaga cgaacgtggt gttgaaccgt 33000 
ttaagtttcc tgaacgtccg ttaatgcgtg ttgagtcaga cttgtctgca ccgaaaagca 33060 
aaggtgtgac accgattaag cattttgaag cgcctgctgt tgctggtcat catagagtgc 33120 
ctaaccaagc accgtttaca ccttggcata tgtttgagtt tgcgacgggt aatatttcta 33180 
actgtttcgg tcctgatttt gatgtttatg aaggtcgtat tccacctcgt acaccttgtg 33240 
gcgatttaca agttgttact caggttgtag aagtgcaggg cgaacgtctt gatcttaaaa 33300
atccatcaag ctgtgtagct gaatactatg taccggaaga cgcttggtac tttactaaaa 33360 
acagccatga aaactggatg ccttattcat taaccatgga aattgcattg caaccaaatg 33420 
gctttatttc tggttacatg ggcacgacgc ttaaataccc tgaaaaagaL ctgttcttcc 33480 
gtaaccttga tggtagcggc acgttattaa agcagattga tttacgcggc aagaccattg 33540 
tgaataaatc agtcttggtt agtacggcta ttgctggtgg cgcgattatt caaagtttca 33600 
cgtttgatat gtctgtagat ggcgagctat tttatactgg taaagctgta tttggttact 33660 
ttagtggtga atcactgact aaccaactgg gcattgataa cggtaaaacg actaatgcgt 33720 
ggtttgttga taacaatacc cccgcagcga atattgatgt gtttgattta actaatcagt 33780 
cattggctct gtataaagcg cctgtggata aaccgcatta taaattggct ggtggtcaga 33840 
tgaactttat cgatacagtg tcagtggttg aaggcggtgg taaagcgggc gtggcttatg 33900 
tttatggcga acgtacgatt gatgctgatg attggttctt ccgttatcac ttccaccaag 33960 
atccggtgat gccaggttca ttaggtgttg aagctattat tgagttgatg cagacctatg 34020 
cgcttaaaaa tgatttgggt ggcaagtttg ctaacccacg tttcattgcg ccgatgacgc 34080 
aagttgattg gaaataccgt gggcaaatta cgccgctgaa taaacagatg tcactggacg 34140 
tgcatatcac tgagatcgtg aatgacgctg gtgaagtgcg aatcgttggt gatgcgaatc 34200 
tgtctaaaga tggtctgcgt atttatgaag ttaaaaacat cgttttaagt attgttgaag 34260
cgtaaagggt caagtgtaac gtgcttaagc gccgcattgg ttaaagacgc tttgcacgcc 34320 
gtgaatccgt ccatggaggc ttggggttgg catccatgcc aacaacagca agcttacttt 34380 
aatcaatacg gcttggtgtc catttagacg cctcgaactt agtagttaat agacaaaata 34440
atttagctgt ggaatgaata tagtaagtaa tcattcggca gctacaaaaa aggaattaag 34500 
aatgtcgagt ttaggtttta acaataacaa cgcaattaac tgggcttgga aagtagatcc 34560 
agcgtcagtt catacacaag atgcagaaat taaagcagct ttaatggatc taactaaacc 34620
tctctatgtg gcgaataatt caggcgtaac tggtatagct aatcatacgt cagtagcagg 34680
tgcgatcagc aataacatcg atgttgatgt attggcgttt gcgcaaaagt taaacccaga 34740
agatctgggt gatgatgctt acaagaaaca gcacggcgtt aaatatgctt atcatggcgg 34800 
tgcgatggca aatggtattg cctcggttga attggttgtt gcgttaggta aagcagggct 34860
gttatgttca tttggtgctg caggtctagt gcctgatgcg gttgaagatg caattcgtcg 34920 
tattcaagct gaattaccaa atggccctta tgcggttaac ttgatccatg caccagcaga 34980 
agaagcatta gagcgtggcg cggttgaacg tttcctaaaa cttggcgtca agacggtaga 35040 
ggcttcagct taccttggtt taactgaaca cattgtttgg tatcgtgctg ctggtctaac 35100 
taaaaacgca gatggcagtg ttaatatcgg taacaaggtt atcgctaaag tatcgcgtac 35160 
cgaagttggt cgccgcttta tggaacctgc accgcaaaaa ttactggata agttattaga 35220 
acaaaataag atcacccctg aacaagctgc tttagcgttg cttgtaccta tggctgatga 35280 
tattactggg gaagcggatt ctggtggtca tacagataac cgtccgtttt taacattatt 35340 
accgacgatt attggtctgc gtgatgaagt gcaagcgaag tataacttct ctcctgcatt 35400
acgtgttggt gctggtggtg gtatcggaac gcctgaagca gcactcgctg catttaacat 35460
gggcgcggct tatatcgttc tgggttctgt gaatcaggcg tgtgttgaag cgggtgcatc 35520 
tgaatatact cgtaaactgt tatcgacagt tgaaatggct gatgtgacta tggcacctgc 35580 
tgcagatatg tttgaaatgg gtgtgaagct gcaagtatta aaacgcggtt ctatgttcgc 35640 
gatgcgtgcg aagaaactgt atgacttgta tgtggcttat gactcgattg aagatatccc 35700 
agctgctgaa cgtgagaaga ttgaaaaaca aatcttccgt gcaaacctag acgagatttg 35760
ggatggcact atcgctttct ttactgaacg cgatccagaa atgctagccc gtgcaacgag 35820 
tagtcctaaa cgtaaaatgg cacttatctt ccgttggtat cttggccttt cttcacgctg 35880 
gtcaaacaca ggcgagaagg gacgtgaaat ggattatcag atttgggcag gcccaagttt 35940 
aggtgcattc aacagctggg tgaaaggttc ttaccttgaa gactataccc gccgtggcgc 36000 
tgtagatgtt gctttgcata tgcttaaagg tgctgcgtat ttacaacgtg taaaccagtt 36060 
gaaattgcaa ggtgttagct taagcacaga attggcaagt tatcgtacga gtgattaatg 36120 
ttacLtgatg atatgtgaac taattaaagc gcctgagggc gctttttttg gtttttaact 36180
caggtgttgt aactcgaaat tgcccctttc aagttagacc gattactcac tcacaatatg 36240 
ttgatatcgc acttgccata tacttgctca tccaaagccc tatattgata atggtgttaa 36300 
tagtctttaa tatccgagtc tttcttcagc ataatactaa tatagagact cgaccaatgt 36360 
taaacacaac aaagaatata ttcttgtgta ctgccttatt attaacgagt gcgagtacga 36420 
cagctactac gctaaacaat tcgatatcag caattgaaca acgtatttct ggtcgtatcg 36480 
gtgtggctgt tttagatacg caaaataaac aaacgtgggc ttacaatggt gatgcacatt 36540 
ttccgatgat gagtacattc aaaaccctcg cttgcgcgaa aatgctaagt gaatcgacaa 36600 
atggtaatct ggatcccagt actagctcat tgataaaggc tgaagaatta atcccttggt 36660 
caccagtcac taaaacgttt gtgaataaca ctattacagt ggcgaaagcg tgtgaagcaa 36720 
caatgctgac cagtgataat accgcggcta atattgtttt acagtatatc ggaggccctc 36780 
aaggcgttac tgcattcttg cgagaaattg gtgatgaaga gagtcagtta gatcgtatag 36840 
aacctgaatt gaatgaagct aaggtcggag acttgcgtga taccacgaca ccgaaagcca 36900 
tagttaccac gctcaacaaa ctactacttg gtgatgttct acttgatttg gataaaaacc 36960 
aacttaaaac atggatgcaa aataataaag tgtcagatcc tttactgcgt tctatattac 37020 
cgcaaggctg gtttattgcc gaccgctcag gtgcgggtgg taatggttct cgaggtataa 37080 
ctgctatgct ttggcactcc gagcgtcaac cgctaatcat cagtatttat ttaaccgaaa 37140 
ctgagttagc aatggcaatg cgcaatgaga ttattgttga gatcggtaag ctgatattca 37200 
aagaatacgc ggtgaaataa taagttattt tttgataata ctttaacgag cgtagctatc 37260 
gaagtgaggg cgtcaattag acacctttgc ttcccctaca aaatctaatg tgtattacct 37320 
cggctagtac aattgcccta agttatttct gtccagcttt ggcttagtgc aattgcgtta 37380 
gccaatgtga acaccaaggg actttgtcgt accataacta ccaagcgact ttgtcgtttt 37440 
tatcttttct tagacaaaca gaggttaaat gagtgacgcc ttccaaatca caggaatgaa 37500 
tccgcatttc aataaaatct aacccgtacc aactccgtac aagttgatct ttagttgttt 37560 
aaaatctata ataaattcaa ttacggaatt aatccgtaca actggaggtt ttatggctac 37620 
tgcaagactt gatatccgtt tggatgaaga aatcaaagct aaggctgaga aagcatcagc 37680 
tttactcggc ttaaaaagtt taaccgaata cgttgttcgc ttaatggacg aagattcaac 37740 
taaagtagtt tctgagcatg agagtattac cgttgaagcg aatgtattcg accaatttat 37800 
ggctgcttgt gatgaagcga aagccccaaa taaagcatta cttgaagccg ctgtatttac 37860 
tcagaatggt gagtttaagt gagttattcc aaacgtttca aagaactgga taaatcaaaa 37920 
catgacagag catcatttga ctgtggcgaa aaagagctaa atgattttat ccaaactcaa 37980 
gcagccaaac atatgcaagc aggtattagc cgcactctgg ttttacctgc ttctgcgccg 38040 
ttaccaaaca aaaaatatcc aatttgctca ttttatagta tcgcgccaag ctcaattagc 38100 
cgcgatacgt taccacaagc aatggctaaa aagttaccac gttatcctat ccctgttttt 38160 
cttttggctc aacttgccgt ccataaagag tttcatggga gtgggttagg caaagttagc 38220 
ttaattaaag cgttagagta cctttgggaa attaactctc acatgagagc ttacgccatc 38280 
gttgttgatt gtttaactga acaagctgag tcattctacg ctaaatatgg tttcgacgtt 38340 
ctctgcgaaa taaatggtcg agtaagaatg ttcatatcaa tgaaaacagt caatcagtta 38400 
ttcacttaac agtaagagtt agtataacag ttgtatgaat taaatttatt atattcggta 38460 
atctcattgc gatcacgcta gaagtgcgag cgggtcagac cgaggccaca atagcagccg 38520 
ttacgtttag gggatgactt aaaaagataa ctactacgtc agtggcgatc ctagaggatt 38580 
aaaggtttat gattcacaac atttatttat tgtgcttaat tttttctatc caatatgcgc 38640 
aagctgtaaa tatcactgaa gtagactttt atgtcagtga tgatatccct aaagatgttg 38700 
ccaaattaaa gataggtgaa tccataacga actccagcct tattctaagt aactcatcta 38760 
ttccactctc gcgggagacg ggtaacatat attactcttc atcaattgct aacttgaact 38820 
atgactcgat agaatttgtt atggctcaat tgatggccga agattccagc ctttacaaga 38880 
tgctggtaaa tagcgatagg ttgtccgtgc tagtaatgac atcttcccag tccacagatc 38940 
tctatggctc gacttactcg gcttattttc ctaatgttgc ggtcatcgat ttgaattgtg 39000 
actcgctaac tttagaacat gagctcggcc atctatacgg agctgaacat gaagaaatat 39060
atgacgacta tgccttctat gctgcgatat gtggagacta tacgactatc atgaactcta 39120 
tgcagcctga aatgaaagaa aaacaaatga taaaggcata ttcattccct gaattaaaag 39180 
tggatggctt gcagtgcgga aatgaaaata cgaataacaa aaaggttatt ttagacaata 39240 
ttggtcggtt tagataggat tgggatatta ttctcattcg gctctactta gtgctgttat 39300 
tatgagtgcc agtgcttcta tctacgatat tggtcttaac aagtatttat ctatagacgc 39360 
taaggtgtta tgtatttaag ggatgttcaa gatgaaacta ggtgtaaacg atgtatagtt 39420 
gtataacatt ttttcaacgg ttggaacgtt cgattctatc gggtaacaag accgcgacga 39480 
tccgcgataa gtccgatagt cattacttag ttggtcagat gttagatgct tgtactcacg 39540 
aagataatcg gaaaatgtgt caaatagaaa tactgagcat tgaatatgtg acgtttagtg 39600 
aattaaaccg tgcgcacgcc aatgctgaag gtttaccgtt tttgtttatg cttaagtgga 39660 
tagttcgaaa gatttatccg acttcaaacg atttattttt cataagtttc agagttgtaa 39720 
ctatcgatat cttataagtc ttagtgcaca aaacagaact atttatagcg ctcaagaagg 39780 
cgataatttg ataatgaatt atcgccttgt tadtattaag agactttaaa tgactgagat 39840 
ataagatatg acacggaaga acatattgat cacaggcgca agttcagggt tgggccgagg 39900 
tatggccatc gaatttgcaa aatcaggtca taacttagca ctttgtgcac gtagacttga 39960 
taatttagtt gcactgaaag cagaactctt agccctcaat cctcacatcc aaatcgaaat 40020 
aaaacctctt gatgtcaatg aacatgaaca agtcttcact gttttccatg aattcaaagc 40080 
tgaatttggt acgcttgatc gtattattgt taatgctgga ttaggcaagg gtggatcc 40138 



<210> 13 
<211> 19227 
<212> DNA 

<213> Vibrio marinus 



<400> 13 

aaatgcaatt aattatggcg taaatagagt gaaaacatgg ctaatattca ctaagtcctg 60
aattttatat aaagtttaat ctgttatttt agcgtttacc tggtcttatc agtgaggttt 120
atagccatta ttagtgggat tgaagtgatt tttaaagcta tgtatattat tgcaaatata 180
aattgtaaca attaagactt tggacacttg agttcaattt cgaattgatt ggcataaaat 240
ttaaaacagc taaatctacc tcaatcattt tagcaaatgt atgcaggtag atttttttcg 300
ccatttaaga gtacacttgt acgctaggtt tttgtttagt gtgcaaatga acgttttgat 360
gagcattgtt tttagagcac aaaatagatc ctracaggag caataacgca atggctaaaa 420
agaacaccac atcgattaag cacgccaagg atgtgttaag tagtgatgat caacagttaa 480
attctcgctt gcaagaatgt ccgattgcca tcattggtat ggcatcggtt tttgcagatg 540
ctaaaaactt ggatcaattc tgggataaca tcgttgactc tgtggacgct attattgatg 600
tgcctagcga tcgctggaac attgacgacc attactcggc tgataaaaaa gcagctgaca 660
agacatactg caaacgcggt ggtttcattc cagagcttga ttttgatccg atggagtttg 720
gtttaccgcc aaatatcctc gagttaactg acatcgctca attgttgtca ttaattgttg 780
ctcgtgatgt attaagtgat gctggcattg gtagtgatta tgaccatgat aaaattggta 840
tcacgctggg tgtcggtggt ggtcagaaac aaatttcgcc attaacgtcg cgcctacaag 900
gcccggtatt agaaaaagta ttaaaagcct caggcattga tgaagatgat cgcgctatga 960
tcatcgacaa atttaaaaaa gcctacatcg gctgggaaga gaactcattc ccaggcatgc 1020
taggtaacgt tattgctggt cgtatcgcca atcgttttga ttttggtggt actaactgtg 1080
tggttgatgc ggcatgcgct ggctcccttg cagctgttaa aatggcgatc tcagacttac 1140
ttgaatatcg ttcagaagtc atgatatcgg gtggtgtatg ttgtgataac tcgccattca 1200
tgtatatgtc attctcgaaa acaccagcat ttaccaccaa tgatgatatc cgtccgtttg 1260
atgacgattc aaaaggcatg ctggttggtg aaggtattgg catgatggcg tttaaacgtc 1320
ctgaagacgc cgaacgt,a<: ,,cgacaaaa ctc.ccctgt actg.aaggt a.c^tacac 1380
cc ca.a.gg tcgtxccaaa .cta.tcacg ctccacgccc agatggccaa gcaaaagcgc 1440
ta..ac...c ttatgaagat gccggttttg cccccg aac atgtggrcra attgaaggcc 1500
atggtacggg taccaaagcg ggtgacgccg cagaacttgc tggcttgacc aaacacttcg 1560
gcgLgccag tgatgaaaag caacacatcg ccct.ggctc agttaaa.cg caaat.ggtc 1620
Lactaaatc tgcggctggc cccgcgggta tga.taaggc ggcattagcg c.gcatcata 1680
aaatcctacc tgcaacgatc catarcgata aaccaagtga agccttggat atcaaaaaca 1740
gcccgttata cctaaacagc gaaacgcgcc cttggargcc acgtgaagar ggtatcccac 1800
gtcgtgcagg catcagctca ttcggc«t, gcggcaccaa ctccatatt attttagaag 1860
agtaccgccc aggtcacgat agcgcatatc gctcaaactc agtgagccaa actg.gt ga 1920 
tltcggcaaa cgaccaacaa ggtattgttg c.gagttaaa taactggcg. ac.aaactgg 1980
"g^cgatgc tgatcatcaa gggtctg.a. .caacgagtt agtgacaacg tggccattaa 2040
aaaccLatc cgt.aaccaa gctcgt«.g gtt.tgtrgc gcgtaatgca -^^-^-^^ ^^OO 
tcgcgatgat tgatacggca ttgaaacaat tcaarg.gaa cgcaga.aaa "J-^tjg. 2 0 
cagtlcctac cgggg^tac tatcgtcaag ccggta.tga tgcaacaggt aaagtggttg 2220 
cgctat.ctc agggcaaggt tcgcaatacg tgaaca.ggg tcgtgaatta acc.g.aact 2280
tcccaagcat gatgcacagt gctgcggcga tgga.aaaga gt.cagtgcc g.««g 2340 
,cca,tta.c tgcagrtact ttccctarcc ctgtttatac ggacgccgag cgtaagctac 2400 
lagalgagca attacgttta acgcaacacg cgcaaccagc ga«ggtagt «gag.gttg 2460
gcccgttcaa aacgrrtaag caagcaggtt ttaaagci:ga tcctgccgcc ggtcatagct 2520 
t^ggrgagrt: aacfgcatta cgggc.gccg atgcacngag cgaaagcga. tacatgacgt 2580
.agc/cgLg tcg.ggtcaa g«a.gg.c.g cgcagagca acaagatt.t .-9-g9- a 
a44!cg! tgctgttggt. gatccaaagc .ag.cgcrg. gat:cattgac -cc.gat, 2700
aJg^Sctlt tgct:aac.tc a.ctcgaata a.«.g«g. .a..gc.ggt -^-^^ag^ 2760
a^gttgctg. agcg,t:t:aca aecttagg.a a.g=t:gg.« «aag«grg cca.^gccgg 2820 
tatcgcgc /.Cccataca cc««g..tc gt:cacgcgca aaaaccattt '-""^^^'^^80 
ttga.agcg= taaa.«aaa gcg.caagca t.ccagtgtt: .gc.aatggc f 
tgLctcaag caaaccgaat gacat:taaga aaaacc.gaa aaaccacarg "«-CCt^ 3000
.La^.tcaa ccaagaaatt gacaacacct acg.tgatgg cggocgcgta 

ttggtccaaa gaatgcat.a ac.aaaccgg ttgaaaacat t.i:=^=ng.. aaatctgatg 3120 
tglccgca. cgcgg..aat gctaa.ccta aacaacccgc ggacgtacaa -tgcgccaag 3180
.Igcgctgca aacggcagtg ct.ggtgtcg ca«agacaa ta.cgacccg -^cgacgccg 3240
ttlagcgtcc actrgrtgcg ccgaaagcat caccaatgtt gatgaagtta tctgcagcgr 3300 
cttatgctag tccgaaaacg aagaaagcgt «g«ga.gc attgaccga. ggccggactg 3360
ttaa^!aagc gaaagctgca cctgccgttg tgtcacaacc acaagtgatt gaaaagatcg 3420 
,:tgaagttga aaagacagtt gaacgcartg tcgaagtaga .gcgtattgtc ^-^tagaaa 3480 
aaaccgtcta cgttaatact gacggttcgc ttatatcgca aaataatcaa gacgttaaca 3540
gcgctgttgc tagcaac^g acraaragct cagtgactca tagcagtgat gccgaccttg 3600 
ttgcctccac tgaacgcagt gttg.gtcaat tcgtcgcaca ccaacagcaa ttac.aaatg 3660
tacacgaaca gtttatgcaa ggtccacaag actacgcgaa aacagtgcag aacgtacti;g 3720 
ctgcgcagac gagcaatgaa ccaccggaaa gtttagaccg tacattgtct acgtataacg 3780 
agtcccaatc agaaacgcta cgtgtacatg aaacgtacct gaacaatcag acgagcaaca 3840 
tgaacaccat gcttactggt gctgaagctg argtgccagc aaccccaata actcaggtag 3900 
,:gaatacagc cgttgccact agtc.caagg tagttgcccc agctattgct aatacagtga 3960 
cgaatgttgt atctagtgcc agtaataacg cggcggttgc agtgcaaacc gtggcatcag 4020
cgcctacgca agaaatcgct ccaacagtcg ctac.:acgcc agcacccgca ttggttgcta 4080
tcgtggcga acccgcgat. gt.gcgcatg ctgctacaga agttgcacca atcacaccat 4140
cagttacacc agccgtcgca actcaagcgg ctatcgatgt agcaactatt aacaaagcaa 4200 
tgttagaagt tgttgctgat aaaaccggtt atccaacgga tacqctggaa ctgagcatgg 4260
acatggaagc tgactcaggt atcgactcaa tcaaacgcgt tgagataCta ggcgcagtac 4320 
aggaattgac ccctgaccca cctgaacttia accctgaaga tcttgctgag ctacgcacgc 4380 
ttggtgagat tgtcgattac atgaattcaa aagcccaggc tgtagctcct acaacagtac 4440 
ctgtaacaag tgcacctgtt tcgcctgcat ctgctggtat tgatttagcc cacatccaaa 4500 
acgtaacgtt agaagtggtt gcagacaaaa ccggtcaccc aacagacatg ctagaactga 4560 
gcacggatat ggaagctgac Ctaggtattg attcaatcaa gcgtgtggaa atcttaggtg 4620 
cagtacagga gatca^aact gatttacctg agctaaaccc Cgaagatctt gctgaatrac 4680 
gcaccccagg Cgaaatcgtt agttacatgc aaagcaaagc gccagtcgct gaaagcgcgc 4740 
cagtggcgac ggctcctgta gcaacaagct cagcaccgtc tatcgatttg aaccacattc 4800
aaacagcgat. gaCggargta gttgcagata agactggtta tccaactgac atigctagaac 4860
tt:ggcatgga catggaagct gatttaggta tcgattcaat caaacgtgtg gaaatattag 4920 
gcgcagrgca ggagatcatc actgattcac ctgagcraaa cccagaagac ctcgctgaat 4980 
tacgcacgcc aggtgaaacc gttagtcaca tgcaaagcaa agcgccagtc gctgagagtg 5040
cgccagcagc gacggcctct gtagcaacaa gctcrgcacc gtctatcgat ttaaaccaCa 5100
tccaaacagt gacgacggaa gtggttgcag acaaaaccgg ttatccagta gacatgt^ag 5160
aac^tgc^at ggacatggaa gcrgacctag gtaccgattc aatcaagcgt gtiagaaattt 5220 
taggtgcggt acaggaaatc attactgacr tacctgagct naaccctgaa gatcttgctg 5280
aaccacgt:ac actaggtgaa atcgttagtt acabge^aag caaagcgccc gCagccgaag 5340
cgcct:gcagc acct:gttgca gtagaaagtg cacccactag tgtaacaagc ccagcaccgt 5400 
ctatcgac.tr agaccacatc caaaatgcaa tgatggatgt rgttgctgar aagactggtic 5460
atcctgccaa tatgcttgaa ttagcaatgg acatggaagc cgaccttggt attgatrcaa 5520
tcaagcgtgc rgaaatlcta ggcgcggtac aggagatcac tactgartta cctgaactaa 5580
acccagaaga cttagctgaa ctacgtacgt tagaagaaat tgtaacctac atgcaaagca 5640
aggcgagrgg tgttactgca aatgtagtgg ctagccctga aaataatgct gtatcagatg 5700
catttatgcra aagcaatgtg gcgactarca cagcggccgc agaacataag gcggaattta 5760
aaccggcgcc gagcgcaacrc gttgctatcr ctcgtctaag ctctaccagt. aaaataagcc 5820
aagattgtaa aggtgctaac gccttaatcg tagctgatgg cactgataat gctgtgttac 5880
tugcagacca cctattgcaa actggctgga atgtaactgc attgcaacca acttgggtag 5940
ccgtaacaac gacgaaagca trtaataagt cagtgaacct ggtg&cttta aatggcrgttg 6000 
atgaaaccga aatcaacaac attattaccg cLaacgcaca actggatgca gttatctatic 6060 
tgcacgcaag cagcgaaatt aacgctatcg aatacccaca agcatctaag caaggcctga 6120
tgttagcctt cttattagcg aaattgagta aagcaactca agccgctaaa gcgcgtggcg 6180 
ccttcatgat tgttactcag cagggtggtt cattaggctt tgatgacatc gattctgcta 6240 
caagtcarga tgtgaaaaca gacctagtac aaagcggctt aaacggttta gttaagacac 6300 
tgtctcacga gtgggataac gtattctgtc gtgcggttga nattgcttcg tcattaacgg 6360 
crgaacaagt tgcaagccct gttagtgatg aactacttga tgctaacact gtattaacag 6420 
aagtgggtta tcaacaagct ggtaaaggcc ttgaacgtat cacgttaact ggtgtggcta 6480 
ctgacagcta tgcattaaca gctggcaata acatcgatgc taactcggta tttttagcga 6540
gtggtggcgc aaaaggtgta accgcacatt gtgttgctcg tatagctaaa gaatatcagt 6600 
ctaagttcat cttattggga cgttcaacgt tctcaagtga cgaaccgagc tgggcaagtg 6660 
gtattactga tgaagcggcg ttaaagaaag cagcgatgca gtctttgatt acagcaggtg 6720 
ataaaccaac acccgctaag atcgtacagc taatcaa'acc aatccaagct aatcgtgaaa 6780 
ttgcgcaaac cttgtctgca attaccgctg ctggtggcca agctgaatat gcccctgcag 6840
atgtaactaa tgcagcaagc gtacaaatgg cagtcgctcc agctatcgct aagctcggtg 6900 
caatcactgg catcattcat ggcgcgggCg tgtcagctga ccaaLtcatt gagcaaaaaa 6960 
cactgagtga tttrgagtcc gtttacagca ctaaaattga cggtCCgtta ccgctactat 7020 
cagtcactga agcaagcaac atcaagcaat cggtattgct cccgtcagcg gctggtttct 7080 
acggtaaccc cggccagtct gattactcga ttgccaatga gatcttaaa. -aaccgcat 7140
accgctttaa atcattgcac ccacaagctc aagtattgag ctttaactgg 99 ccttggg 7200
acggtggcat ggtaacgcct gagcttaaac gtatgtttga ccaacgtggt ^tttacat 7260
ttccacttga tgcaggtgca cagttattgc tgaatgaact agccgctaat ^ataaccgtt 7320
gtccacaaat cctcgtgggt aatgacttat ctaaagatgc tagctctgat "aaagtctg 7380
atgaaaagag tactgctgta, aaaaagccac aagttagtcg tttatcagat ^ctttagtaa 7440
ctaaaagtat caaagcgact aacagtagct ctttatcaaa caagactagt gctttatcag 7500
acagtagtgc ttttcaggtt aacgaaaacc actttttagc tgaccacatg -caaaggca 7560
atcaggtatt accaacggta tgcgcgattg cttggatgag tgatgcagca aaagcgactt 7620
atagtLccg agactgtgca ttgaagtatg tcggtttcga agactataaa ttgtttaaag 7680
g g ggttJ tgatggcaat gaggcggcgg attaccaaat ccaattgtcg cctgtgacaa 7740
gggcg!caga acaggattct gaagtccgta ttgccgcaaa gatctttagc ctgaaaagtg 7800
acggtaaacc tgtgtttcat tatgcagcga caatattgtt agcaactcag — "-tg 8 0 
ctgtgaaggt agaacttccg acattgacag aaagtgttga tagcaacaat aaagtaactg 7920 
a gaagcaca agogttatac .agcaatggca ccttgttcca cggtgaaagt ctgcagggca 7980
ttaagcagat attaagttgt gacgacaagg gcctgctatt ggcttgtcag ataaccgatg 8040
ttgcaacagc taagcaggga tccttcccgt tagctgacaa caatatcttt ^ccaatgat 8100
tggtttatca ggctatgttg gtctgggtgc gcaaacaatt tggtttaggt agcttacctt 8160
c/g^gacaac ggc«ggact g.gtatcgtg aagtggttgt agatgaagta ttttatctg 8220
aacttaatgt tgctgagcat gatctattgg gttcacgcgg cagtaaagcc cg^tgtgata 8280
ttcaattgat tgctgctgat atgcaattac ttgccgaagt gaaatcagcg caagtcagtg 8340
tcagtgacat tttgaacgat atgtcatgat cgagtaaata ataacgatag gcgtcatggt 8400
gagcatggcg tctgctttct tcatttttta acattaacaa tattaatagc taaacgcggt 8460
tgc-tttaaac caagtaaaca agtgctttta gctattacta ttccaaacag gatattaaag 8520
agaatatgac ggaattagct gttattggta tggatgctaa att.agcgga caagacaata 8580
ttgaccgt:gt ggaacgcgct ttctatgaag gtgcttatgt aggtaatgtt agccgcgtta 8640 
gtaccgaatc taatgttatt agcaatggcg aagaacaagt tattactgcc atgacagttc 8700 
ttaactctgt cagtctacta gcgcaaacga atcagttaaa tatagctgat atcgcggtgt 8760 
tgctgattgc tgatgtaaaa agtgctgatg atcagcttgt agtccaaatt ^"tcagcaa 8820
ttgaaaaaca gtgtgcgagt tgtgttgtta ttgctgattt aggccaagca t-aatcaag 8880
tagctgattt agttaataac caagactgtc ctgtggctgt aattggcatg aataactcgg 8940
ttaatttatc tcgtcatgat cttgaa.ctg taactgcaac aatcagcttt 9-gaaacct 9000
tcaatggtta taacaatgta gctgggttcg cgagtttact tatcgcttca actgcgtttg 9060
ccaatgctaa gcaatgttat atatacgcca acattaaggg cttcgctcaa tcgggcgtaa 9120
atgctcaatt taacgttgga aacattagcg atactgcaaa gaccgcattg cagcaagcta 9180
gcataactgc agagcaggtt ggtttgctag aagtgtcagc agtcgctgat tcggcaatcg 9240 
cattgtctga aagccaaggt ttaatgtctg cttatcatca tacgcaaact ttgcatactg 9300
cattaagcag tgcccgtagt gtgactggtg aaggcgggtg tttttcacag gtcgcaggtt 9360 
tattgaaatg tgtaattggt ttacatcaac gttatattcc ggcgattaaa gattggcaac 9420
aaccgagtga caatcaaatg tcacggtggc ggaattcacc attctatatg cctgtagatg 9480 
ctcgaccttg gttcccacat gctgatggct ctgcacacat tgccgcttat agttgtgtga 9540
ctgctgacag ctattgtcat attcttttac aagaaaacgt cttacaagaa cttgttttga 9600 
aagaaacagt cttgcaagat aatgacttaa ctgaaagcaa gcttcagact cttgaacaaa 9660 
acaatccagt agctgatctg cgcactaatg gttactttgc atcgagcgag ttagcattaa 9720 
tcatagtaca aggtaatgac gaagcacaat tacgctgtga attagaaact attacagggc 9780 
agttaagtac tactggcata agtactatca gtattaaaca gatcgcagca gactgttatg 9840 
cccgtaatga tactaacaaa gcctatagcg cagtgcttat tgccgagact gctgaagagt 9900
taagcaaaaa aataaccttg gcgtttgctg gtatcgctag cgtgtttaat gaagatgcta 9960 
aagaatggaa aaccccgaag ggcagttatt ttaccgcgca gcctgcaaat aaacaggctg 10020
cJacagcac acagaacggt g.caccccca .g.acccagg Cattgg.gct acatatgtcg 10080
gtttagggcg tgatctattt catctattcc cacagattta tcagcctgta gcggctttag 10140
ccgatgacat tggcgaaagt ctaaaagata ctttacttaa tccacgcagt attagtcgtc 10200
atagctttaa agaactcaag cagttggatc tggacctgcg cggtaactta gccaatatcg 10260
ctgaagccgg tgtgggt^tt gcttgtgtgt ttaccaaggt atttgaagaa gtctttgccg 10320
ttaaagctga ctttgccaca ggttatagca tgggtgaagt aagcatgtat gcagcactag 10380 
gctgctggca gcaaccggga ttgatgagtg ctcgccttgc acaatcgaat acctttaatc 10440 
atcaactttg cggcgagtta agaacactac gtcagcattg gggcatggat gatgtagcta 10500 
acggtacgtt cgagcagatc tgggaaacct ataccattaa ggcaacgatt gaacaggtcg 10560 
aaattgcctc tgcagatgaa gatcgtgtgt attgcaccat tatcaataca cctgatagct 10620 
tgttgttagc cggttatcca gaagcctgtc agcgagtcat taagaattta ggtgtgcgtg 10680 
caatggcatt gaatatggcg aacgcaattc acagcgcgcc agcttatgcc gaatacgatc 10740
atatggttga gctataccat atggatgtta ctccacgtat taataccaag atgtattcaa 10800 
gctcatgtta tttaccgatt ccacaacgca gcaaagcgat ttcccacagt attgctaaat 10860
gtttgtgtga tgtggtggat ttcccacgtt tggttaatac cttacatgac aaaggtgcgc 10920 
gggtattcat tgaaatgggt ccaggtcgtt cgttatgtag ctgggtagat aagatcttag 10980 
ttaatggcga tggcgataat aaaaagcaaa gccaacatgt atctgttcct gtgaatgcca 11040 
aaggcaccag tgatgaactt acttatattc gtgcgattgc taagttaatt agtcatggcg 11100
tgaatttgaa tttagatagc ttgtttaacg ggtcaatcct ggttaaagca ggccatatag 11160 
caaacacgaa caaatagtca acatcgatat ctagcgctgg tgagttatac ctcattagtt 11220 
gaiatatgga tttaaagaga gtaattatgg aaaatattgc agtagtaggt attgctaatt 11280 
tgttcccggg ctcacaagca ccggatcaat tttggcagca attgcttgaa caacaagatt 11340 
gccgcagtaa ggcgaccgct gttcaaatgg gcgttgatcc tgctaaatat accgccaaca 11400 
aaggtgacac agataaattt tactgtgtgc acggcggtta catcagtgat ttcaattttg 11460 
atgcttcagg ttatcaactc gataatgatt atttagccgg tttagatgac cttaatcaat 11520 
gggggcttta tgttacgaaa caagccctta ccgatgcggg ttattggggc agtactgcac 11580 
tagaaaactg tggtgtgatt ttaggtaatt tgtcattccc aactaaatca tctaatcagc 11640 
tgtttatgcc tttgtatcat caagttgttg ataatgcctt aaaggcggta ttacatcctg 11700 
attttcaatt aacgcattac acagcaccga aaaaaacaca tgctgacaat gcattagtag 11760 
caggttatcc agctgcattg atcgcgcaag cggcgggtct tggtggttca cattttgcac 11820 
tggatgcggc ttgtgcttca tcttgttata gcgttaagtt agcgtgtgat tacctgcata 11880 
cgggtaaagc caacatgatg cttgctggtg cggtatctgc agcagatcct atgttcgtaa 11940 
atatgggttt ctcgatattc caagcttacc cagctaacaa tgtacatgcc ccgtttgacc 12000 
aaaattcaca aggtctattt gccggtgaag gcgcgggcat gatggtattg aaacgtcaaa 12060 
gtgatgcagt acgtgatggt gatcatattt acgccattat taaaggcggc gcattatcga 12120 
atgacggtaa aggcgagttt gtattaagcc cgaacaccaa gggccaagta ttagtatatg 12180 
aacgtgctta tgccgatgca gatgttgacc cgagtacagt tgactatatt gaatgtcatg 12240 
caacgggcac acctaagggt gacaatgttg aattgcgttc gatggaaacc tttttcagtc 12300 
gcgtaaataa caaaccatta ctgggctcgg ttaaatctaa ccttggtcat ttgttaactg 12360 
ccgctggtat gcctggcatg accaaagcta tgttagcgct aggtaaaggt cttattcctg 12420 
caacgattaa cttaaagcaa ccactgcaat ctaaaaacgg ttactttact ggcgagcaaa 12480 
tgccaacgac gactgtgtct tggccaacaa ctccgggtgc caaggcagat aaaccgcgta 12540 
ccgcaggtgt gagcgtattt ggttttggtg gcagcaacgc ccatttggta ttacaacagc 12600 
caacgcaaac actcgagact aattttagtg ttgctaaacc acgtgagcct ttggctatta 12660 
ttggtatgga cagccatttt ggtagtgcca gtaatttagc gcagttcaaa accttattaa 12720 
ataataatca aaataccttc cgtgaattac cagaacaacg ctggaaaggc atggaaagta 12780 
acgctaacgt catgcagtcg ttacaattac gcaaagcgcc taaaggcagt tacgttgaac 12840 
agccagatat tgatttcttg cgtttta^ag taccgcctaa tgaaaaagat tgcttgatcc 12900
cgcaacagtt aatgatgacg caagcggc^g acaatgctgc gaaagacgga ggtctagttg 12960
aaggtcgtaa tgctgcggta tcagtagcga tgggcatgga actggaatta caccagtatc 13020
gtggtcgcgt taatctaacc acccaaatcg aagacagcct attacagcaa ggtattaacc 13080
tgactgttga gcaacgtgaa gaactgacca atattgctaa agacggtgtc gccccggctg 13140
cacagctaaa tcagtatacg agtttcattg gtaatactac ggcgtcacgt atctcggcgt 13200
tatgggattt ttctggtcct gccatcaccg taccggctga agaaaaccct gtttatcgtt 13260
gtgttgaatt agctgaaaat ctatttcaaa ccagtgatgt tgaagccgtt attattgctg 13320
ctgttgattt gtctggctca artgaaaaca teacttcacg tcagcactac ggtccagtta 13380
atgaaaaggg atctgtaagt gaatgtggtc cggtcaatga aagcagtrca gtaaccaaca 13440 
atatrcttga tcagcaacaa tggctggtgg gtgaaggcgc agcggctatt gtcgttaaac 13500 
cgtcatcgca agtcactgct gagcaagrtt acgcgcgtat tgatgcggcg agtttcgccc 13560
ctggtagcaa tgcgaaagca attacgartg cagcggataa agcactaaca cttgctggta 13620
tcagtgctgc tgargragct agtgtcgaag cacatgcaag tgg«tcagt gccgaaaaca 13680
atgctgaaaa aaccgcgtta ccgacrrtat acccaagcgc aagtaccagt tcggcgaaag 13740
ecaatattgg tcaracgttt aargcctcgg gtacggcgag nattatcaaa acggcgctgc 13800 
cgttagatca gaatacgagt caagatcaga aaagcaaaca tattgctatt aacggtctag 13860 
gtcgtgataa- cagot-gegcg caCcttatct tatcgagttc agcgcaagcg catc^agtrg 13920 
caccagcgcc tgtatctggt atggccaagc aacgcccaca gttagttaaa accateaaac 13980 
rcggtggtca grt^ttagc aacgcgarr, ttaacagcgc gagttcarct ttacacgcta 14040
rtaaagcgca gt^gccggt: aagcaczi:taa acaaag«aa ccagocagtg atgarggata 14100
acctgaagcc cca^ggrati: agcgctcatg caaccaatga gtatgtggtg actggagctg 14160
ctaacactca agcrrccaac attcaagcat ctcatgttca agcgtcaagc catgcacaag 14220 
agatagcacc aaaccaagrt' caaaatatgc aagctacagc agccgctgta agttcacccc 14280 
Cttcccaaca ccaac:*paca gcgcagcocg tagcggcacc gagcgttgtt ggagtgacCg 14340 
tgaaacataa agc«.»gtaac caaattcatc agcaagcgT:c tacgcacaaa gc^tttttag 14400 
aaagcegttt agetgcacag .aaaaacctat cgcaacttgt tgaactgcaa accaagctgt 14460 
caatccaaac ,:ggcagi:gaa aatacatcta acaatactgc gtcaacaagc aatacagtgc 14520 
taacaaatcc cgratcagca acgccatcaa cacttgtgtc taacgcgcct gtagtagcga 14580 
caaacetaac cagCacagaa gcaaaagcgc aagcagctgc tacacaagct ggtttceaga 14640 
taaaaggacc tgttggttac aactatccac cgctgcagtt aactgaacgt tataacaaac 14700
cagaaaacgt gacctacgar caagrtgatt tggttgaatt cgctgaaggt gatattggta 14760 
aggtatttgg cgccgaatac aatattactg atggctattc gcgtcgtgta cgtctgccaa 14820 
cctcagatta cttgtcagta acacgtgcta ctgaacttga tgccaaggtg catgaataca 14880
agaaatcara catgtgtact gaatatgatg tgctcgctga tgcaccgrtc ttaattgacg 14940 
gtcagatccc ttggtctgtt gccgtcgaac cagg«agtg tgatctgatg ttgatttcat 15000
atatcggtac cgacccccaa gcgaaaggcg aacgtgttra ccgtttacct gattgtgaat 15060 
taactttcct tgaagagatg gcttttggtg gcgatacctt acgttacgag atccacattg 15120 
attcgtatgc acgtaacggc gagcaatcat tattcttctt ccactacgat tgttacgtag 15180 
gggataagaa ggcacttatc acgcgtaacg gttgtgctgg tttctttact gacgaagaac 15240
cttctgatgg taaaggcgtt attcataacg acaaagacaa agctgagttt agcaatgctg 15300
ctaaatcacc attcacgccg ttattacaac ataaccgtgg tcaatacgat tataacgaca 15360
tgatgaagct ggtcaatggt gatgttgcca gttgttttgg tccgcaatat gatcaaggtg 15420 
gccgtaatcc atcattgaaa ctctcgtctg agaagttctt gatgattgaa cgcattacca 15480
agatagaccc aaccggtggt cattggggac caggcctgtt agaaggteag aaagatttag 15540 
accccgagca ttggtatttc ccttgtcact ttaaaggtga tcaagtaatg gctggttcgt 15600 
tgatgccgga aggtcgtggc caaatggcqa tgttcttc»t gctgtctctt ggtatgcata 15660 
ccaatgcgaa caacgctcgt ttccaaccac taccaggtga atcacaaacg gtacgttgtc 15720 
gtgggcaagt accgccacag cgcaacacct taacttaccg tatggaagtt accgcgacgg 15780
gtatgcatcc acagccattc atgaaagcta atattgatat tttgcttgac ggtaaagtgg 15840 
ttgttgattt caaaaacttg agcgtgatga tcagcgaaca agacgagcat tcagattacc 15900
ctgcaacact gccgagtaat gtggcgccta aagcgattac tgcacctgtt gcgtcagtag 15960 
caccagcatc ttcacccgct aacagcgcgg atctagacga acgtggtgtt gaaccgtcta 16020
agtttcctga acgtccgtta atgcgtgttg agtcagactt grctgcaccg aaaagcaaag 16080 
gtgtgacacc gattaagcat tttgaagcgc ctgctgttgc tggtcatcat agagtgccra 16140 
accaagcacc gtttacacct tggcacatgt ctgagtttgc gacgggtaac atttctaact 16200
gtttcggtcc tgactttgat gttcatgaag gtcgtattcc acctcgtaca ccttgtggcg 16260 
atttacaagt tgtcactcag gcrgtagaag tgcagggcga acgtcttgat cttaaaaatc 16320 
catcaagctg tgtagctgaa tactatgtac cggaagacgc ttggtacttt actaaaaaca 16380 
gccatgaaaa ctggatgcct tattcattaa tcatggaaat tgcattgcaa ccaaatggct 16440 
ttatttctgg ttacatgggc acgacgctta aataccctga aaaagatccg ttcttccgta 16500 
acctrgatgg tagcggcacg rractaaagc agattgattt acgcggcaag accattgtga 16560 
ataaatcagt cttggttagt acggctattg ctggtggcgc gattattcaa agtttcacgt 16620 
rtgatatgtc tgtagatggc gagctattrt atactggtaa agctgtattt ggttacttta 16680
gtggtgaatc actgactaac caactgggca ttgataacgg taaaacgact aatgcgtggt 16740 
ttgttgataa caataccccc gcagcgaata ttgatgtgtr tgarttaact aatcagtcat 16800 
tggctctgta taaagcgcct gtggataaac cgcattataa artggctggt ggccagatga 16860
actttatcga tacagtgtca gtggttgaag gcggcggtaa agcgggcgrg gcttatgttt 16920 
acggcgaacg tacgattgar gctgatgatt ggttcttccg tcatcacttc caccaagatc 16980 
cggtgatgcc aggttcatta ggtgttgaag ctattattga gttgatgcag acctacgcgc 17040 
ttaaaaatga tttgggcggc aagtttgcta acccacgttt cattgcgccg argacgcaag 17100
ttgattggaa at^accgTiggg caaacracgc cgctgaafcaa acagatgcca ctggacgcgc 17160
atatcactga gatcgtgaat gacgccggtg aagtgcgaat cgttggtgat gcgaatctgt 17220
ctaaagatgg tctgcgtatt tatgaagtta aaaacatcgt tttaagtatt gttgaagcgt 17280 
aaagggtcaa gtgtaacgtg cttaagcgcc gcattggtta aagacgctrt. geacgccgtg 17340 
aatccgtcca tggaggcttg gggtnggcat ccatgccaac aacagcaagc cractttaat 17400 
caatacggcc tggtgtccat ttagacgccn cgaacttagt agttaataga caaaataatt 17460 
tagctgtgga acgaacatag taagtaatca ttcggcagct acaaaaaagg aattaagaat 17520
gtcgagttta ggtcttaaca ataacaacgc aattaactgg gcttggaaag tagatccagc 17580 
gtcagtteat acacaagatg cagaaattaa agcagcctta atggatctaa ctaaaccrct 17640 
ctatgtggcg aataattcag gcgcaactgg tatagctaat catacgtcag tagcaggtgc 17700 
gatcagcaat aacatcgatg ttgatgtatt ggcgttcgcg caaaagctaa acccagaaga 17760
tctgggtgat gatgctcaca agaaacagca cggcgttaaa tatgcttatc atggcggtgc 17820
gatggcaaar ggtattgcct cggctgaatt ggttgttgcg ttaggtaaag cagggctgrt 17880
atgttcattt ggtgctgcag gtctagtgcc tgatgcggtc gaagatgcaa txcgtcgtat 17940 
tcaagctgaa ttaccaaatg gcccttatgc ggttaacttg atccatgcac cagcagaaga 18000
agcattagag cgtggcgcgg ttgaacgttt cctaaaactt ggcgtcaaga cggtagaggc 18060
ttcagcttac cttggtctaa ctgaacacat tgtttggtat cgcgctgctg gtctaactaa 18120
aaacgcagat ggcagtgcta atatcggtaa caaggttatc gcraaagtat cgcgtaccga 18180
agttggtcgc cgctttargg aacctgcacc gcaaaaatca ctggataagt tattagaaca 18240
aaataagatc acccctgaac aagctgcttt agcgttgctt gcacctatgg ctgatgatat 18300
tactggggaa gcggattctg gtggtcatac agataaccgt ccgtttttaa cattattacc 18360
gacgattatt ggtctgcgtg atgaagtgca agcgaagtat aacttctctc ctgcattacg 18420 
tgttggcgct ggtggtggta tcggaacgcc tgaagcagca ctcgctgcat ttaacatggg 18480
cgcggctrat atcgttctgg gtcctgtgaa tcaggcgtgt gttgaagcgg gtgcatctga 18540 
atatactcgt aaactgttat cgacagttga aacggctgac gtgaccatgg cacccgctgc 18600
agatatgttt gaaatgggtg tgaagctgca agtattaaaa cgcggttcta tgttcgcgat 18660
gcgtgcgaag aaactgtatg acttgtatgt ggcttatgac tcgattgaag atatcccagc 18720 
tgctgaacgt gagaagattg aaaaacaaat cttccgtgca aacctagacg agatttggga 18780 
tggcactatc gctttcttta ctgaacgcga tccagaaatg ctagcccgtg caacgagtag 18840 
tcctaaacgt aaaatggcac ttatcttccg ttggtatctt ggcctttctt cacgctggtc 18900 
aaacacaggc gagaagggac gtgaaatgga ttatcagatt tgggcaggcc caagtttagg 18960 
tgcattcaac agctgggtga aaggttctta ccttgaagac tatacccgcc gtggcgctgt 19020 
agatgttgct ttgcatatgc ttaaaggtgc tgcgtattta caacgtgtaa accagttgaa 19080 
attgcaaggt gttagcttaa gtacagaatt ggcaagttat cgtacgagtg attaatgtta 19140 
cttgatgata tgtgaattaa ttaaagcgcc tgagggcgct ttttttggtt tttaactcag 19200
gtgttgtaac tcgaaattgc ccctttc 19227

<210> 14 
<211> 217 
<212> DNA 

<213> Shewanella putrefaciens 
<400> 14 

attggtaaaa ataggggtta tgtttgttgc tttaaagagt gtcctgaaaa attgctaact 60 
tctcgattga tttccttata cttctgtccg ttaacaatac aagagtgcga taaccagact 120 
acagagttgg ttaagtcatg gctgcctgaa gatgagttaa ttaaggttaa tcgctacatt 180 
aaacaagaag ctaaaactca aggtttaatg gtaagag 217 

<210> 15
<211> 72 
<212> PRT 

<213> Shewanella putrefaciens 
<400> 15 

Ile Gly Lys Asn Arg Gly Tyr Val Cys Cys Phe Lys Glu Cys Pro Glu
1 5 10 15

Lys Leu Leu Thr Ser Arg Leu Ile Ser Leu Tyr Phe Cys Pro Leu Thr
20 25 30

Ile Gln Glu Cys Asp Asn Gln Thr Thr Glu Leu Val Lys Ser Trp Leu
35 40 45

Pro Glu Asp Glu Leu Ile Lys Val Asn Arg Tyr Ile Lys Gln Glu Ala
50 55 60

Lys Thr Gln Gly Leu Met Val Arg
65 70



<210> 16 
<211> 885 
<212> DNA 
<213> Shewanella putrefaciens



<400> 16 

agcgaaatgc ttatcaagaa attccaagat caatacatca ctgggaagaa aattcattcc 60
ctggttcact gggtaacgtt atttccggcc gtattgctaa ccgcttcgac cttggtggca 120
tgaactgtgt cgttgatgca gcatgtgcag gccctcttgc tgcattgcgt atggcattaa 180
gcgagcttgt tgaaggccgc agcgaaatga tgattacagg tggtgtgtgt accgataact 240
caccaaccat gtacatgagc ttctctaaaa caccggcatt cacgacaaac gaaacaattc 300
aaccattcga tattgactcg aaaggtatga tgattggtga aggtatcggt atgattgcgc 360
ttaaacgtct tgaagacgca gagcgtgatg gcgaccgtat ctattccgtg attaaaggtg 420
ttgggtgcat cttcagacgg taatttatta agagtantta tgcgcntcgt cctgaaggtc 480
aggctaaggc acttaaacgt gcttacgacg atgcaggttt cgcaccgcac acacttggct 540
tacttgaagc ccacggcaca ggcacagcag caggtgatgt ggcagaattc agtggtctta 600
actctgtatt cagtgaaggc aatgacgaaa agcaacacat cgcattaggt tcagtgaaat 660
cacagattgg tcacactaaa tcaacagcgg gtactgcggg tctaatcaaa gcgtctttag 720
cactgcacca taaagtactg ccgccaacaa tcaatgtaac cagccctaac cctaaactga 780
atattgaaga ctcgcctttc tacctcaata cacagacgcg tccatggatg caacgtgtcg 840
atggtacacc gcgtcgtgct ggtattagct catttggttt tggtg 885



<210> 17 
<211> 409 
<212> DNA 

<213> Shewanella putrefaciens 
<400> 17 

ccaagctaaa gcacttaacc gtgcttatga agatgccggt tttgcccctg aaacatgtgg 60
tctaattgaa ggccatggta cgggtaccaa agcgggtgat gccgcagaat ttgctggctt 120
gaccaaacac tttggcgccg ccagtgatga aaagcaatat atcgccttag gctcagttaa 180
atcgcaaatt ggtcatacta aatctgcggc tggctctgcg ggtatgatta aggcggcatt 240
agcgctgcat cataaaatct tacctgcaac gatccatatc gataaaccaa gtgaagcctt 300
ggatatcaaa aacagcccgt tatacctaaa cagcgaaacg cgtccttgga tgccacgtga 360
agatggtatt ccacgtcgtg caggtattag ctcatttggt tttggtggc 409



<210> 18
<211> 81 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: SYNTHETIC 
<400> 18 

ccaagctaaa gcacttaacc gtgcctatga tgatgccggt tttgcccctg aaacatgtgg 60 
tctaattgaa ggccatggta c 81

<210> 19 
<211> 81 
<212> DNA 
<213> Artificial Sequence
<220> 

<223> Description of Artificial Sequence: SYNTHETIC 
<400> 19 

ccaagctaaa gcacttaacc gtgcttatga agatgccggt tttgcccctg aaacatgtgg 60 
tctaattgaa ggccatggta c 81 

<210> 20 
<211> 43 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: SYNTHETIC 
<400> 20 

agaacgcaaa gttgccgcac tgtttggtcg ccaaggttca caa 43 

<210> 21 
<211> 43
<212> DNA 

<213> Artificial Sequence
<220> 

<223> Description of Artificial Sequence: SYNTHETIC 
<400> 21 

caaagcgggt gatgccgcac tgtttggtcg cttgacctaa cac 43

<210> 22 
<211> 55 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: SYNTHETIC 
<400> 22 

cattgcgcta ggttcagtta aatcacaaac tggtcatact aaatcaactg caggt 55

<210> 23 
<211> 55 
<212> DNA 

<213> Artificial Sequence 
<220> 
<223> Description of Artificial Sequence: SYNTHETIC

<400> 23 

tatcgcctta ggctcagtta aatcgcaaat tggtcatact aaatctgcgg ctggc 55 

<210> 24 

<211> 29 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: SYNTHETIC 

<400> 24 

cggcttcgat tttggcggca tgaacggtg 29 

<210> 25 

<211> 29 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: SYNTHETIC 



<400> 25 

cgcgtatgat taaggcggca ttagcgctg 29

<210> 26

<211> 28

<212> DNA

<213> Artificial Sequence

<220> 

<223> Description of Artificial Sequence: SYNTHETIC 
<400> 26 

gcactgctgc aagcatgaac gcgtcgtt 28 

<210> 27 
<211> 28 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: SYNTHETIC 
<400> 27 

gctctgcggc tatcattaac gcggcatt 28 
<210> 28
<211> 29 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: SYNTHETIC 
<400> 28 

tccctggtgc taaccatatc agcaaacca 29

<210> 29 
<211> 29 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: SYNTHETIC 
<400> 29 

tacctgcaac gatccatatc gataaacca 29

<210> 30

<211> 98 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: SYNTHETIC 
<400> 30 

ctcacctttg tatctaaaca ctgagacttc gtccatggtt accacgtgtt gatggtacgc 60 
cgcgccgcgc gggtattagc tcatttggtt ttggtggc 98 

<210> 31 
<211> 98 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: SYNTHETIC 
<400> 31 

cagcccgtta tacctaaaca gcgaaacggc gtccttggat gccacgtgaa gatggtattc 60 
cacgtcgtgc aggtattagc tcatttggtt ttggtggc 98 

<210> 32 
<211> 4
<212> PRT 

<213> Shewanella putrefaciens 

<400> 32
Asp Xaa Ala Cys
1 



<210> 33 
<211> 4 
<212> PRT 

<213> Shewanella putrefaciens 

<400> 33 
Gly Phe Gly Gly 
1 



<210> 34 
<211> 5 
<212> PRT

<213> Shewanella putrefaciens 
<400> 34 

Gly His Ser Xaa Gly 
1 5 



<210> 35 
<211> 6 
<212> PRT 

<213> Shewanella putrefaciens 
<400> 35 

Leu Gly Xaa Asp Ser Leu 
1 5 



<210> 36 
<211> 6 
<212> PRT 

<213> Shewanella putrefaciens 
<400> 36 

Leu Gly Xaa Asp Ser Ile
1 5 
<210> 37
<211> 6 
<212> PRT 

<213> Shewanella putrefaciens 
<400> 37 

Gly Xaa Gly Xaa Xaa Gly 
1 5 



<210> 38 
<211> 6 
<212> PRT 

<213> Shewanella putrefaciens 
<400> 38 

Gly Xaa Gly Xaa Xaa Ala 
1 5 



<210> 39
<211> 6 
<212> PRT 

<213> 'Axial Seamount ' polynoid polychaete 
<400> 39

Gly Xaa Gly Xaa Xaa Pro 
1 5 



<210> 40 
<211> 5 
<212> PRT 

<213> Shewanella putrefaciens 
<400> 40 

Gly Xaa Ser Xaa Gly 
1 5 



<210> 41 
<211> 35 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic
<400> 41

cuacuacuac uaccaagcta aagcacttaa ccgtg 35

<210> 42 
<211> 32 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic 
<400> 42 

cuacuacuac uaacagcgaa atgcttatca ag 32

<210> 43 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic
<400> 43

cuacuacuac uagcgaccaa aaccaaatga gctaatac 38

<210> 44 
<211> 12 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic 

<400> 44 
aagcccgggc tt 12

<210> 45 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic 
<400> 45 

gtacaagccc gggcttagct 20

<210> 46
<211> 56 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic 
<400> 46 

cgcgatttaa atggcgcgcc ctgcaggcgg ccgcctgcag ggcgcgccat ttaaat 56 

<210> 47 
<211> 41 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic 
<400> 47 

ctgcagctcg agacaatgtt gatttcctta tacttctgtc c 41 

<210> 48 
<211> 37 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic 
<400> 48 

ggatccagat ctctagctag tcttagctga agctcga 37 

<210> 49 
<211> 39 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic 
<400> 49 

tctagactcg agacaatgag ccagacctct aaacctaca 39 

<210> 50 

<211> 37 

<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: synthetic 
<400> 50 

cccgggctcg agctaattcg cctcactgtc gtttgct 37 

<210> 51 
<211> 39 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic 
<400> 51 

gaattcctcg agacaatgcc gctgcgcatc gcacttatc 39 

<210> 52 
<211> 37 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: synthetic 
<400> 52 

ggtaccagat ctttagactt ccccttgaag taaatgg 37 

<210> 53 
<211> 39 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic
<400> 53 

gaattcgtcg acacaatgtc attaccagac aatgcttct 39 

<210> 54 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic 
<400> 54

tctagagtcg acttatacag attcttcgat gctgatag 38

<210> 55 
<211> 39 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic 
<400> 55 

gaattcgtcg acacaatgaa tcctacagca actaacgaa 39 

<210> 56 
<211> 37 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic 
<400> 56 

tctagaggat ccttaggcca ttctttggtt tggcttc 37 

<210> 57 
<211> 39 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic 
<400> 57 

tctagagtcg acacaatggc ggaattagct gttattggt 39

<210> 58 
<211> 36 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic 
<400> 58 

gtcgacggat ccctatttgt tcgtgtttgc tatatg 36

<210> 59
<211> 42
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic 
<400> 59 

gtcgacggat ccacaatgaa tatagtaagt aatcattcgg ca 42

<210> 60 
<211> 37 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic 
<400> 60 

gtcgacctcg agttaatcac tcgtacgata acttgcc 37

<210> 61 
<211> 39 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic 
<400> 61 

cccgggtcga cacaatggct aaaaagaaca ccacatcga 39

<210> 62 
<211> 40 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic 
<400> 62 

cccgggtcga ctcatgacat atcgttcaaa atgtcactga 40

<210> 63 
<211> 44 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: synthetic 



<400> 63 

tcgacatgga aaatattgca gtagtaggta ttgctaattt gttc 44

<210> 64 
<211> 44 
<212> DNA

<213> Artificial Sequence 

<220>

<223> Description of Artificial Sequence: synthetic 

<400> 64 

ccgggaacaa attagcaata cctactactg caatattttc catg 44

<210> 65 
<211> 21 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: synthetic 



<400> 65 

tcagatgaac tttatcgata c 21

<210> 66 
<211> 36 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: synthetic 



<400> 66 

tcatgagacg tcgtcgactt acgcttcaac aatact 36

<210> 67 
<211> 30 
<212> DNA 

<213> Schizochytrium aggregatum 
<400> 67 

gtgatgatct ttccctgatg cacgccaagg 30

<210> 68
<211> 30
<212> DNA 

<213> Schizochytrium aggregatum 
<400> 68

agctcgagac cggcaacccg cagcgccaga 30

<210> 69 
<211> 4446 
<212> DNA 

<213> Schizochytrium aggregatum 



<400> 69 

cgctgccgcc gcgtctcgcc gcgccgcgcc gcgccgccgc cgccgctcgc gcgcacgccc 60 
gcgcgtctcg ccgcgcctgc tgtctcgaac gagcttctcg agaaggccga gaccgtcgtc 120 
atggaggtcc tcgccgccaa gactggctac gagactgaca tgatcgagtc cgacatggag 180 
ctcgagactg agctcggcat tgactccatc aagcgtgtcg agatcctctc cgaggttcag 240 
gccatgctca acgtcgaggc caaggacgtc gacgctctca gccgcactcg cactgtgggt 300 
gaggtcgtca acgccatgaa ggctgagatc gctggtggct ctgccccggc gcctgccgcc 360 
gctgccccag gtccggctgc tgccgcccct gcgcctgctg tctcgagcga gcttctcgag 420 
aaggccgaga ctgtcgtcat ggaggtcctc gccgccaaga ctggctacga gactgacatg 480 
attgagtccg acatggagct cgagaccgag ctcggcattg actccatcaa gcgtgtcgag 540 
attctctccg aggttcaggc catgctcaac gtcgaggcca aggacgtcga cgctctcagc 600 
cgcactcgca ctgttggtga ggtcgtcgat gccatgaagg ctgagatcgc tggcagctcc 660 
gcctcggcgc ctgccgccgc tgctcctgct ccggctgctg ccgctcctgc gcccgctgcc 720 
gccgcccctg ctgtctcgaa cgagcttctc gagaaagccg agactgtcgt catggaggtc 780 
ctcgccgcca agactggcta cgagactgac atgatcgagt ccgacatgga gctcgagact 840 
gagctcggca ttgactccat caagcgtgtc gagatcctct ccgaggttca ggccatgctc 900 
aacgtcgagg ccaaggacgt cgatgccctc agccgcaccc gcactgttgg cgaggttgtc 960 
gatgccatga aggccgagat cgctggtggc tctgccccgg cgcctgccgc cgctgcccct 1020 
gctccggctg ccgccgcccc tgctgtctcg aacgagcttc ttgagaaggc cgagactgtc 1080 
gtcatggagg tcctcgccgc caagactggc tacgagaccg acatgatcga gtccgacatg 1140 
gagctcgaga ccgagctcgg cattgactcc atcaagcgtg tcgagattct ctccgaggtt 1200 
caggccatgc tcaacgtcga ggccaaggac gtcgatgctc tcagccgcac tcgcactgtt 1260 
ggcgaggtcg tcgatgccat gaaggctgag atcgccggca gctccgcccc ggcgcctgcc 1320 
gccgctgctc ctgctccggc tgctgccgct cctgcgcccg ctgccgctgc ccctgctgtc 1380 
tcgagcgagc ttctcgagaa ggccgagacc gtcgtcatgg aggtcctcgc cgccaagact 1440 
ggctacgaga ctgacatgat tgagtccgac atggagctcg agactgagct cggcattgac 1500 
tccatcaagc gtgtcgagat cctctccgag gttcaggcca tgctcaacgt cgaggccaag 1560 
gacgtcgatg ccctcagccg cacccgcact gttggcgagg ttgtcgatgc catgaaggcc 1620 
gagatcgctg gtggctctgc cccggcgcct gccgccgctg cccctgctcc ggctgccgcc 1680 
gcccctgctg tctcgaacga gcttcttgag aaggccgaga ccgtcgtcat ggaggtcctc 1740 
gccgccaaga ctggctacga gaccgacatg atcgagtccg acatggagct cgagaccgag 1800 
ctcggcattg actccatcaa gcgtgtcgag attctctccg aggttcaggc catgctcaac 1860 
gtcgaggcca aggacgtcga cgctctcagc cgcactcgca ctgttggcga ggtcgtcgat 1920 
gccatgaagg ctgagatcgc tggtggctct gccccggcgc ctgccgccgc tgctcctgcc 1980 
tcggctggcg ccgcgcctgc ggtcaagatt gactcggtcc acggcgctga ctgtgatgat 2040 
ctttccctga tgcacgccaa ggtggttgac atccgccgcc cggacgagct catcctggag 2100 
cgccccgaga accgccccgt tctcgttgtc gatgacggca gcgagctcac cctcgccctg 2160
gtccgcgtcc tcggcgcctg cgccgttgtc ctgacctttg agggtctcca gctcgctcag 2220 
cgcgctggtg ccgctgccat ccgccacgtg ctcgccaagg atctttccgc ggagagcgcc 2280 
gagaaggcca tcaaggaggc cgagcagcgc tttggcgctc tcggcggctt catctcgcag 2340 
caggcggagc gcttcgagcc cgccgaaatc ctcggcttca cgctcatgtg cgccaagttc 2400 
gccaaggctt ccctctgcac ggctgtggct ggcggccgcc cggcctttat cggtgtggcg 2460 
cgccttgacg gccgcctcgg attcacttcg cagggcactt ctgacgcgct caagcgtgcc 2520 
cagcgtggtg ccatctttgg cctctgcaag accatcggcc tcgagtggtc cgagtctgac 2580 
gtcttttccc gcggcgtgga cattgctcag ggcatgcacc ccgaggatgc cgccgtggcg 2640 
attgtgcgcg agatggcgtg cgctgacatt cgcattcgcg aggtcggcat tggcgcaaac 2700 
cagcagcgct gcacgatccg tgccgccaag ctcgagaccg gcaacccgca gcgccagatc 2760
gccaaggacg acgtgctgct cgtttctggc ggcgctcgcg gcatcacgcc tctttgcatc 2820 
cgggagatca cgcgccagat cgcgggcggc aagtacattc tgcttggccg cagcaaggtc 2880 
tctgcgagcg aaccggcatg gtgcgctggc atcactgacg agaaggctgt gcaaaaggct 2940 
gctacccagg agctcaagcg cgcctttagc gctggcgagg gccccaagcc cacgccccgc 3000 
gctgtcacta agcttgtggg ctctgttctt ggcgctcgcg aggtgcgcag ctctattgct 3060 
gcgattgaag cgctcggcgg caaggccatc tactcgtcgt gcgacgtgaa ctctgccgcc 3120 
gacgtggcca aggccgtgcg cgatgccgag tcccagctcg gtgcccgcgt ctcgggcatc 3180 
gttcatgcct cgggcgtgct ccgcgaccgt ctcatcgaga agaagctccc cgacgagttc 3240 
gacgccgtct ttggcaccaa ggtcaccggt ctcgagaacc tcctcgccgc cgtcgaccgc 3300 
gccaacctca agcacatggt cctcttcagc tcgctcgccg gcttccacgg caacgtcggc 3360 
cagtctgact acgccatggc caacgaggcc cttaacaaga tgggcctcga gctcgccaag 3420 
gacgtctcgg tcaagtcgat ctgcttcggt ccctgggacg gtggcatggt gacgccgcag 3480 
ctcaagaagc agttccagga gatgggcgtg cagatcatcc cccgcgaggg cggcgctgat 3540 
accgtggcgc gcatcgtgct cggctcctcg ccggctgaga tccttgtcgg caactggcgc 3600 
accccgtcca agaaggtcgg ctcggacacc atcaccctgc accgcaagat ttccgccaag 3660 
tccaacccct tcctcgagga ccacgtcatc cagggccgcc gcgtgctgcc catgacgctg 3720
gccattggct cgctcgcgga gacctgcctc ggcctcttcc ccggctactc gctctgggcc 3780
attgacgacg cccagctctt caagggtgtc actgtcgacg gcgacgtcaa ctgcgaggtg 3840 
accctcaccc cgtcgacggc gccctcgggc cgcgtcaacg tccaggccac gctcaagacc 3900 
ttttccagcg gcaagctggt cccggcctac cgcgccgtca tcgtgctctc caaccagggc 3960 
gcgcccccgg ccaacgccac catgcagccg ccctcgctcg atgccgatcc ggcgctccag 4020 
ggctccgtct acgacggcaa gacccccttc cacggcccgg ccttccgcgg catcgatgac 4080 
gtgctctcgt gcaccaagag ccagcttgtg gccaagtgca gcgctgtccc cggctccgac 4140 
gccgctcgcg gcgagtttgc cacggacact gacgcccatg accccttcgt gaacgacctg 4200 
gcctttcagg ccatgctcgt ctgggtgcgc cgcacgctcg gccaggctgc gctccccaac 4260
tcgatccagc gcatcgtcca gcaccgcccg gtcccgcagg acaagccctt ctacattacc 4320 
ctccgctcca accagtcggg cggtcactcc cagcacaagc acgcccttca gttccacaac 4380 
gagcagggcg atctcttcat tgatgtccag gcttcggtca tcgccacgga cagccttgcc 4440
ttctaa 4446 

<210> 70 
<211> 1481 
<212> PRT 

<213> Schizochytrium aggregatum
<400> 70

Arg Cys Arg Arg Val Ser Pro Arg Arg Ala Ala Pro Pro Pro Pro Leu
1 5 10 15

Ala Arg Thr Pro Ala Arg Leu Ala Ala Pro Ala Val Ser Asn Glu Leu
20 25 30

Leu Glu Lys Ala Glu Thr Val Val Met Glu Val Leu Ala Ala Lys Thr
35 40 45

Gly Tyr Glu Thr Asp Met Ile Glu Ser Asp Met Glu Leu Glu Thr Glu
50 55 60

Leu Gly Ile Asp Ser Ile Lys Arg Val Glu Ile Leu Ser Glu Val Gln
65 70 75 80

Ala Met Leu Asn Val Glu Ala Lys Asp Val Asp Ala Leu Ser Arg Thr
85 90 95

Arg Thr Val Gly Glu Val Val Asn Ala Met Lys Ala Glu Ile Ala Gly
100 105 110

Gly Ser Ala Pro Ala Pro Ala Ala Ala Ala Pro Gly Pro Ala Ala Ala
115 120 125

Ala Pro Ala Pro Ala Val Ser Ser Glu Leu Leu Glu Lys Ala Glu Thr
130 135 140

Val Val Met Glu Val Leu Ala Ala Lys Thr Gly Tyr Glu Thr Asp Met
145 150 155 160

Ile Glu Ser Asp Met Glu Leu Glu Thr Glu Leu Gly Ile Asp Ser Ile
165 170 175

Lys Arg Val Glu Ile Leu Ser Glu Val Gln Ala Met Leu Asn Val Glu
180 185 190

Ala Lys Asp Val Asp Ala Leu Ser Arg Thr Arg Thr Val Gly Glu Val
195 200 205

Val Asp Ala Met Lys Ala Glu Ile Ala Gly Ser Ser Ala Ser Ala Pro
210 215 220

Ala Ala Ala Ala Pro Ala Pro Ala Ala Ala Ala Pro Ala Pro Ala Ala
225 230 235 240

Ala Ala Pro Ala Val Ser Asn Glu Leu Leu Glu Lys Ala Glu Thr Val
245 250 255

Val Met Glu Val Leu Ala Ala Lys Thr Gly Tyr Glu Thr Asp Met Ile
260 265 270

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Glu Ser Asp Met Glu Leu Glu Thr Glu Leu Gly lie Asp Ser He Lys 
275 280 285 

Arg Val Glu He Leu Ser Glu Val Gin Ala Met Leu Asn Val Glu Ala 
290 295 300 

Lys Asp Val Asp Ala Leu Ser Arg Thr Arg Thr Val Gly Glu Val Val 
305 310 315 320 

Asp Ala Met Lys Ala Glu He Ala Gly Gly Ser Ala Pro Ala Pro Ala 
325 330 335 

Ala Ala Ala Pro Ala Pro Ala Ala Ala Ala Pro Ala Val Ser Asn Glu 
340 345 350 

Leu Leu Glu Lys Ala Glu Thr Val Val Met Glu Val Leu Ala Ala Lys 
355 360 365 

Thr Gly Tyr Glu Thr Asp Met He Glu Ser Asp Met Glu Leu Glu Thr 
,370 375 380 

Glu- Leu Gly He Asp Ser He Lys Arg Val Glu He Leu Ser Glu Val 
385 390 395 400 

Gin Ala Met Leu Asn Val Glu Ala Lys Asp Val Asp Ala Leu Ser Arg 
405 410 415 

Thr Arg Thr Val Gly Glu Val Val Asp Ala Met Lys Ala Glu He Ala 
420 425 430 

Gly Ser Ser Ala Pro Ala Pro Ala Ala Ala Ala Pro Ala Pro Ala Ala 
435 440 445 

Ala Ala Pro Ala Pro Ala Ala Ala Ala Pro Ala Val Ser Ser Glu Leu 
450 455 460 

Leu Glu Lys Ala Glu Thr Val Val Met Glu Val Leu Ala Ala Lys Thr 
465 470 475 480 

Gly Tyr Glu Thr Asp Met He Glu Ser Asp Met Glu Leu Glu Thr Glu 
485 490 495 

Leu Gly He Asp Ser He Lys Arg Val Glu He Leu Ser Glu Val Gin 
500 505 510 



Ala Met Leu Asn Val Glu Ala Lys Asp Val Asp Ala Leu Ser Arg Thr
515 520 525

Arg Thr Val Gly Glu Val Val Asp Ala Met Lys Ala Glu Ile Ala Gly
530 535 540 

Gly Ser Ala Pro Ala Pro Ala Ala Ala Ala Pro Ala Pro Ala Ala Ala 
545 550 555 560 

Ala Pro Ala Val Ser Asn Glu Leu Leu Glu Lys Ala Glu Thr Val Val 
565 570 575 

Met Glu Val Leu Ala Ala Lys Thr Gly Tyr Glu Thr Asp Met Ile Glu
580 585 590

Ser Asp Met Glu Leu Glu Thr Glu Leu Gly Ile Asp Ser Ile Lys Arg
595 600 605

Val Glu Ile Leu Ser Glu Val Gln Ala Met Leu Asn Val Glu Ala Lys
610 615 620

Asp Val Asp Ala Leu Ser Arg Thr Arg Thr Val Gly Glu Val Val Asp
625 630 635 640

Ala Met Lys Ala Glu Ile Ala Gly Gly Ser Ala Pro Ala Pro Ala Ala
645 650 655

Ala Ala Pro Ala Ser Ala Gly Ala Ala Pro Ala Val Lys Ile Asp Ser
660 665 670 

Val His Gly Ala Asp Cys Asp Asp Leu Ser Leu Met His Ala Lys Val 
675 680 685 

Val Asp Ile Arg Arg Pro Asp Glu Leu Ile Leu Glu Arg Pro Glu Asn
690 695 700 

Arg Pro Val Leu Val Val Asp Asp Gly Ser Glu Leu Thr Leu Ala Leu 
705 710 715 720 

Val Arg Val Leu Gly Ala Cys Ala Val Val Leu Thr Phe Glu Gly Leu 
725 730 735 

Gln Leu Ala Gln Arg Ala Gly Ala Ala Ala Ile Arg His Val Leu Ala
740 745 750

Lys Asp Leu Ser Ala Glu Ser Ala Glu Lys Ala Ile Lys Glu Ala Glu
755 760 765 



Gln Arg Phe Gly Ala Leu Gly Gly Phe Ile Ser Gln Gln Ala Glu Arg
770 775 780

Phe Glu Pro Ala Glu Ile Leu Gly Phe Thr Leu Met Cys Ala Lys Phe
785 790 795 800

Ala Lys Ala Ser Leu Cys Thr Ala Val Ala Gly Gly Arg Pro Ala Phe
805 810 815

Ile Gly Val Ala Arg Leu Asp Gly Arg Leu Gly Phe Thr Ser Gln Gly
820 825 830

Thr Ser Asp Ala Leu Lys Arg Ala Gln Arg Gly Ala Ile Phe Gly Leu
835 840 845

Cys Lys Thr Ile Gly Leu Glu Trp Ser Glu Ser Asp Val Phe Ser Arg
850 855 860

Gly Val Asp Ile Ala Gln Gly Met His Pro Glu Asp Ala Ala Val Ala
865 870 875 880

Ile Val Arg Glu Met Ala Cys Ala Asp Ile Arg Ile Arg Glu Val Gly
885 890 895

Ile Gly Ala Asn Gln Gln Arg Cys Thr Ile Arg Ala Ala Lys Leu Glu
900 905 910

Thr Gly Asn Pro Gln Arg Gln Ile Ala Lys Asp Asp Val Leu Leu Val
915 920 925

Ser Gly Gly Ala Arg Gly Ile Thr Pro Leu Cys Ile Arg Glu Ile Thr
930 935 940

Arg Gln Ile Ala Gly Gly Lys Tyr Ile Leu Leu Gly Arg Ser Lys Val
945 950 955 960

Ser Ala Ser Glu Pro Ala Trp Cys Ala Gly Ile Thr Asp Glu Lys Ala
965 970 975

Val Gln Lys Ala Ala Thr Gln Glu Leu Lys Arg Ala Phe Ser Ala Gly
980 985 990

Glu Gly Pro Lys Pro Thr Pro Arg Ala Val Thr Lys Leu Val Gly Ser
995 1000 1005

Val Leu Gly Ala Arg Glu Val Arg Ser Ser Ile Ala Ala Ile Glu Ala
1010 1015 1020 



Leu Gly Gly Lys Ala Ile Tyr Ser Ser Cys Asp Val Asn Ser Ala Ala
1025 1030 1035 1040

Asp Val Ala Lys Ala Val Arg Asp Ala Glu Ser Gln Leu Gly Ala Arg
1045 1050 1055

Val Ser Gly Ile Val His Ala Ser Gly Val Leu Arg Asp Arg Leu Ile
1060 1065 1070

Glu Lys Lys Leu Pro Asp Glu Phe Asp Ala Val Phe Gly Thr Lys Val
1075 1080 1085

Thr Gly Leu Glu Asn Leu Leu Ala Ala Val Asp Arg Ala Asn Leu Lys
1090 1095 1100

His Met Val Leu Phe Ser Ser Leu Ala Gly Phe His Gly Asn Val Gly
1105 1110 1115 1120

Gln Ser Asp Tyr Ala Met Ala Asn Glu Ala Leu Asn Lys Met Gly Leu
1125 1130 1135

Glu Leu Ala Lys Asp Val Ser Val Lys Ser Ile Cys Phe Gly Pro Trp
1140 1145 1150

Asp Gly Gly Met Val Thr Pro Gln Leu Lys Lys Gln Phe Gln Glu Met
1155 1160 1165

Gly Val Gln Ile Ile Pro Arg Glu Gly Gly Ala Asp Thr Val Ala Arg
1170 1175 1180

Ile Val Leu Gly Ser Ser Pro Ala Glu Ile Leu Val Gly Asn Trp Arg
1185 1190 1195 1200

Thr Pro Ser Lys Lys Val Gly Ser Asp Thr Ile Thr Leu His Arg Lys
1205 1210 1215

Ile Ser Ala Lys Ser Asn Pro Phe Leu Glu Asp His Val Ile Gln Gly
1220 1225 1230

Arg Arg Val Leu Pro Met Thr Leu Ala Ile Gly Ser Leu Ala Glu Thr
1235 1240 1245

Cys Leu Gly Leu Phe Pro Gly Tyr Ser Leu Trp Ala Ile Asp Asp Ala
1250 1255 1260

Gln Leu Phe Lys Gly Val Thr Val Asp Gly Asp Val Asn Cys Glu Val
1265 1270 1275 1280 

Thr Leu Thr Pro Ser Thr Ala Pro Ser Gly Arg Val Asn Val Gln Ala
1285 1290 1295

Thr Leu Lys Thr Phe Ser Ser Gly Lys Leu Val Pro Ala Tyr Arg Ala
1300 1305 1310

Val Ile Val Leu Ser Asn Gln Gly Ala Pro Pro Ala Asn Ala Thr Met
1315 1320 1325

Gln Pro Pro Ser Leu Asp Ala Asp Pro Ala Leu Gln Gly Ser Val Tyr
1330 1335 1340

Asp Gly Lys Thr Leu Phe His Gly Pro Ala Phe Arg Gly Ile Asp Asp
1345 1350 1355 1360

Val Leu Ser Cys Thr Lys Ser Gln Leu Val Ala Lys Cys Ser Ala Val
1365 1370 1375

Pro Gly Ser Asp Ala Ala Arg Gly Glu Phe Ala Thr Asp Thr Asp Ala
1380 1385 1390

His Asp Pro Phe Val Asn Asp Leu Ala Phe Gln Ala Met Leu Val Trp
1395 1400 1405

Val Arg Arg Thr Leu Gly Gln Ala Ala Leu Pro Asn Ser Ile Gln Arg
1410 1415 1420

Ile Val Gln His Arg Pro Val Pro Gln Asp Lys Pro Phe Tyr Ile Thr
1425 1430 1435 1440

Leu Arg Ser Asn Gln Ser Gly Gly His Ser Gln His Lys His Ala Leu
1445 1450 1455

Gln Phe His Asn Glu Gln Gly Asp Leu Phe Ile Asp Val Gln Ala Ser
1460 1465 1470

Val Ile Ala Thr Asp Ser Leu Ala Phe
1475 1480 



<210> 71 
<211> 5215 
<212> DNA 

<213> Schizochytrium aggregatum
<400> 71 

tgccgtcttt gaggagcatg acccctccaa cgccgcctgc acgggccacg actccatttc 60 
tgcgctctcg gcccgctgcg gcggtgaaag caacatgcgc atcgccatca ctggtatgga 120 
cgccaccttt ggcgctctca agggactcga cgccttcgag cgcgccattt acaccggcgc 180
tcacggtgcc atcccactcc cagaaaagcg ctggcgcttt ctcggcaagg acaaggactt 240
tcttgacctc tgcggcgtca aggccacccc gcacggctgc tacattgaag atgttgaggt 300 
cgacttccag cgcctccgca cgcccatgac ccctgaagac atgctcctcc ctcagcagct 360 
tctggccgtc accaccattg accgcgccat cctcgactcg ggaatgaaaa agggtggcaa 420 
tgtcgccgtc tttgtcggcc tcggcaccga cctcgagctc taccgtcacc gtgctcgcgt 480 
cgctctcaag gagcgcgtcc gccctgaagc ctccaagaag ctcaatgaca tgatgcagta 540 
cattaacgac tgcggcacat ccacatcgca cacctcgtac attggcaacc tcgtcgccac 600 
gcgcgtctcg tcgcagtggg gcttcacggg cccctccttt acgatcaccg agggcaacaa 660 
ctccgtctac cgctgcgccg agctcggcaa gtacctcctc gagaccggcg aggtcgatgg 720 
cgtcgtcgtt gcgggtgtcg atctctgcgg cagtgccgaa aacctttacg tcaagtctcg 780 
ccgcttcaag gtgtccacct ccgatacccc gcgcgccagc tttgacgccg ccgccgatgg 840
ctactttgtc ggcgagggct gcggtgcctt tgtgctcaag cgtgagacta gctgcaccaa 900 
ggacgaccgt atctacgctt gcatggatgc catcgtccct ggcaacgtcc ctagcgcctg 960 
cttgcgcgag gccctcgacc aggcgcgcgt caagccgggc gatatcgaga tgctcgagct 1020 
cagcgccgac tccgcccgcc acctcaagga cccgtccgtc ctgcccaagg agctcactgc 1080 
cgaggaggaa atcggcggcc ttcagacgat ccttcgtgac gatgacaagc tcccgcgcaa 1140 
cgtcgcaacg ggcagtgtca aggccaccgt cggtgacacc ggttatgcct ctggtgctgc 1200 
cagcctcatc aaggctgcgc tttgcatcta caaccgctac ctgcccagca acggcgacga 1260 
ctgggatgaa cccgcccctg aggcgccctg ggacagcacc ctctttgcgt gccagacctc 1320 
gcgcgcttgg ctcaagaacc ctggcgagcg tcgctatgcg gccgtctcgg gcgtctccga 1380 
gacgcgctcg tgctattccg tgctcctctc cgaagccgag ggccactacg agcgcgagaa 1440 
ccgcatctcg ctcgacgagg aggcgcccaa gctcattgtg cttcgcgccg actcccacga 1500 
ggagatcctt ggtcgcctcg acaagatccg cgagcgcttc ttgcagccca cgggcgccgc 1560 
cccgcgcgag tccgagctca aggcgcaggc ccgccgcatc ttcctcgagc tcctcggcga 1620 
gacccttgcc caggatgccg cttcttcagg ctcgcaaaag cccctcgctc tcagcctcgt 1680 
ctccacgccc tccaagctcc agcgcgaggt cgagctcgcg gccaagggta tcccgcgctg 1740 
cctcaagatg cgccgcgatt ggagctcccc tgctggcagc cgctacgcgc ctgagccgct 1800 
cgccagcgac cgcgtcgcct tcatgtacgg cgaaggtcgc agcccttact acggcatcac 1860
ccaagacatt caccgcattt ggcccgaact ccacgaggtc atcaacgaaa agacgaaccg 1920 
tctctgggcc gaaggcgacc gctgggtcat gccgcgcgcc agcttcaagt cggagctcga 1980 
gagccagcag caagagtttg atcgcaacat gattgaaatg ttccgtcttg gaatcctcac 2040 
ctcaattgcc ttcaccaatc tggcgcgcga cgttctcaac atcacgccca aggccgcctt 2100 
tggcctcagt cttggcgaga tttccatgat ttttgccttt tccaagaaga acggtctcat 2160 
ctccgaccag ctcaccaagg atcttcgcga gtccgacgtg tggaacaagg ctctggccgt 2220 
tgaatttaat gcgctgcgcg aggcctgggg cattccacag agtgtcccca aggacgagtt 2280 
ctggcaaggc tacattgtgc gcggcaccaa gcaggatatc gaggcggcca tcgccccgga 2340 
cagcaagtac gtgcgcctca ccatcatcaa tgatgccaac accgccctca ttagcggcaa 2400 
gcccgacgcc tgcaaggctg cgatcgcgcg tctcggtggc aacattcctg cgcttcccgt 2460 
gacccagggc atgtgcggcc actgccccga ggtgggacct tataccaagg atatcgccaa 2520
gatccatgcc aaccttgagt tccccgttgt cgacggcctt gacctctgga ccacaatcaa 2580 
ccagaagcgc ctcgtgccac gcgccacggg cgccaaggac gaatgggccc cttcttcctt 2640 
tggcgagtac gccggccagc tctacgagaa gcaggctaac ttcccccaaa tcgtcgagac 2700 
catttacaag caaaactacg acgtctttgt cgaggtcggg cccaacaacc accgtagcac 2760 
cgcagtgcgc accacgcttg gtccccagcg caaccacctt gctggcgcca tcgacaagca 2820 
gaacgaggat gcttggacga ccatcgtcaa gcttgtggct tcgctcaagg cccaccttgt 2880 
tcctggcgtc acgatctcgc cgctgtacca ctccaagctt gtggcggagg ctcaggcttg 2940 
ctacgctgcg ctctgcaagg gtgaaaagcc caagaagaac aagtttgtgc gcaagattca 3000 
gctcaacggt cgcttcaaca gcaaggcgga ccccatctcc tcggccgatc ttgccagctt 3060
tccgcctgcg gaccctgcca ttgaagccgc catctcgagc cgcatcatga agcctgtcgc 3120
tcccaagttc tacgcgcgtc tcaacattga cgagcaggac gagacccgag atccgatcct 3180 
caacaaggac aacgcgccgt cttcttcttc ttcttcttct tcttcttctt cttcttcttc 3240 
ttctccgtcg cctgctcctt cggcccccgt gcaaaagaag gctgcccccg ccgcggagac 3300 
caaggctgtt gcttcggctg acgcacttcg cagtgccctg ctcgatctcg acagtatgct 3360 
tgcgctgagc tctgccagtg cctccggcaa ccttgttgag actgcgccta gcgacgcctc 3420 
ggtcattgtg ccgccctgca acattgcgga tctcggcagc cgcgccttca tgaaaacgta 3480 
cggtgtttcg gcgcctctgt acacgggcgc catggccaag ggcattgcct ctgcggacct 3540 
cgtcattgcc gccggccgcc agggcatcct tgcgtccttt ggcgccggcg gacttcccat 3600 
gcaggttgtg cgtgagtcca tcgaaaagat tcaggccgcc ctgcccaatg gcccgtacgc 3660 
tgtcaacctt atccattctc cctttgacag caacctcgaa aagggcaatg tcgatctctt 3720 
cctcgagaag ggtgtcacct ttgtcgaggc ctcggccttt atgacgctca ccccgcaggt 3780 
cgtgcggtac cgcgcggctg gcctcacgcg caacgccgac ggctcggtca acatccgcaa 3840 
ccgtatcatt ggcaaggtct cgcgcaccga gctcgccgag atgttcatgc gtcctgcgcc 3900 
cgagcacctt cttcagaagc tcattgcttc cggcgagatc aaccaggagc aggccgagct 3960 
cgcccgccgt gttcccgtcg ctgacgacat cgcggtcgaa gctgactcgg gtggccacac 4020 
cgacaaccgc cccatccacg tcattctgcc cctcatcatc aaccttcgcg accgccttca 4080 
ccgcgagtgc ggctacccgg ccaaccttcg cgtccgtgtg ggcgccggcg gtggcattgg 4140 
gtgcccccag gcggcgctgg ccaccttcaa catgggtgcc tcctttattg tcaccggcac 4200 
cgtgaaccag gtcgccaagc agtcgggcac gtgcgacaat gtgcgcaagc agctcgcgaa 4260 
ggccacttac tcggacgtat gcatggcccc ggctgccgac atgttcgagg aaggcgtcaa 4320 
gcttcaggtc ctcaagaagg gaaccatgtt tccctcgcgc gccaacaagc tctacgagct 4380 
cttttgcaag tacgactcgt tcgagtccat gccccccgca gagcttgcgc gcgtcgagaa 4440 
gcgcatcttc agccgcgcgc tcgaagaggt ctgggacgag accaaaaact tttacattaa 4500 
ccgtcttcac aacccggaga agatccagcg cgccgagcgc gaccccaagc tcaagatgtc 4560 
gctgtgcttt cgctggtacc tgagcctggc gagccgctgg gccaacactg gagcttccga 4620 
tcgcgtcatg gactaccagg tctggtgcgg tcctgccatt ggttccttca acgatttcat 4680 
caagggaact taccttgatc cggccgtcgc aaacgagtac ccgtgcgtcg ttcagattaa 4740 
caagcagatc cttcgtggag cgtgcttctt gcgccgtctc gaaattctgc gcaacgcacg 4800 
cctttccgat ggcgctgccg ctcttgtggc cagcatcgat gacacatacg tcccggccga 4860 
gaagctgtaa gtaagctctc atatatgtta gttgcgtgag accgacacga agataatatc 4920 
acatacgctt ttgtttgttc tttcaattat ttgtctgtgc ttcatgttgc tcctcagtat 4980 
ctagctggcg gctcttatct tcttttaaaa tatctggaca aggacaaaaa caagaataaa 5040 
ggcgagaaga tgtgaatttc atttcgactt gagaactcga agagcattga tgcggttagt 5100 
atatgggtat tttccagaca cttttcatca tcatcatcat catcatcatt atgaagaagt 5160 
agtagctgat aaagtagact cactgtttgc agcgagaaaa aaaaaaaaaa aaaaa 5215 

<210> 72 
<211> 1622 
<212> PRT 

<213> Schizochytrium aggregatum 
<400> 72 

Ala Val Phe Glu Glu His Asp Pro Ser Asn Ala Ala Cys Thr Gly His 
1 5 10 15

Asp Ser Ile Ser Ala Leu Ser Ala Arg Cys Gly Gly Glu Ser Asn Met
20 25 30

Arg Ile Ala Ile Thr Gly Met Asp Ala Thr Phe Gly Ala Leu Lys Gly
35 40 45

Leu Asp Ala Phe Glu Arg Ala Ile Tyr Thr Gly Ala His Gly Ala Ile
50 55 60

Pro Leu Pro Glu Lys Arg Trp Arg Phe Leu Gly Lys Asp Lys Asp Phe
65 70 75 80

Leu Asp Leu Cys Gly Val Lys Ala Thr Pro His Gly Cys Tyr Ile Glu
85 90 95

Asp Val Glu Val Asp Phe Gln Arg Leu Arg Thr Pro Met Thr Pro Glu
100 105 110

Asp Met Leu Leu Pro Gln Gln Leu Leu Ala Val Thr Thr Ile Asp Arg
115 120 125

Ala Ile Leu Asp Ser Gly Met Lys Lys Gly Gly Asn Val Ala Val Phe
130 135 140

Val Gly Leu Gly Thr Asp Leu Glu Leu Tyr Arg His Arg Ala Arg Val
145 150 155 160

Ala Leu Lys Glu Arg Val Arg Pro Glu Ala Ser Lys Lys Leu Asn Asp
165 170 175

Met Met Gln Tyr Ile Asn Asp Cys Gly Thr Ser Thr Ser Tyr Thr Ser
180 185 190

Tyr Ile Gly Asn Leu Val Ala Thr Arg Val Ser Ser Gln Trp Gly Phe
195 200 205

Thr Gly Pro Ser Phe Thr Ile Thr Glu Gly Asn Asn Ser Val Tyr Arg
210 215 220 

Cys Ala Glu Leu Gly Lys Tyr Leu Leu Glu Thr Gly Glu Val Asp Gly 
225 230 235 240 

Val Val Val Ala Gly Val Asp Leu Cys Gly Ser Ala Glu Asn Leu Tyr 
245 250 255 

Val Lys Ser Arg Arg Phe Lys Val Ser Thr Ser Asp Thr Pro Arg Ala 
260 265 270 

Ser Phe Asp Ala Ala Ala Asp Gly Tyr Phe Val Gly Glu Gly Cys Gly 
275 280 285

Ala Phe Val Leu Lys Arg Glu Thr Ser Cys Thr Lys Asp Asp Arg Ile
290 295 300

Tyr Ala Cys Met Asp Ala Ile Val Pro Gly Asn Val Pro Ser Ala Cys
305 310 315 320

Leu Arg Glu Ala Leu Asp Gln Ala Arg Val Lys Pro Gly Asp Ile Glu
325 330 335

Met Leu Glu Leu Ser Ala Asp Ser Ala Arg His Leu Lys Asp Pro Ser
340 345 350

Val Leu Pro Lys Glu Leu Thr Ala Glu Glu Glu Ile Gly Gly Leu Gln
355 360 365

Thr Ile Leu Arg Asp Asp Asp Lys Leu Pro Arg Asn Val Ala Thr Gly
370 375 380

Ser Val Lys Ala Thr Val Gly Asp Thr Gly Tyr Ala Ser Gly Ala Ala
385 390 395 400

Ser Leu Ile Lys Ala Ala Leu Cys Ile Tyr Asn Arg Tyr Leu Pro Ser
405 410 415

Asn Gly Asp Asp Trp Asp Glu Pro Ala Pro Glu Ala Pro Trp Asp Ser
420 425 430

Thr Leu Phe Ala Cys Gln Thr Ser Arg Ala Trp Leu Lys Asn Pro Gly
435 440 445

Glu Arg Arg Tyr Ala Ala Val Ser Gly Val Ser Glu Thr Arg Ser Cys
450 455 460

Tyr Ser Val Leu Leu Ser Glu Ala Glu Gly His Tyr Glu Arg Glu Asn
465 470 475 480

Arg Ile Ser Leu Asp Glu Glu Ala Pro Lys Leu Ile Val Leu Arg Ala
485 490 495

Asp Ser His Glu Glu Ile Leu Gly Arg Leu Asp Lys Ile Arg Glu Arg
500 505 510

Phe Leu Gln Pro Thr Gly Ala Ala Pro Arg Glu Ser Glu Leu Lys Ala
515 520 525

Gln Ala Arg Arg Ile Phe Leu Glu Leu Leu Gly Glu Thr Leu Ala Gln
530 535 540 






Asp Ala Ala Ser Ser Gly Ser Gln Lys Pro Leu Ala Leu Ser Leu Val
545 550 555 560

Ser Thr Pro Ser Lys Leu Gln Arg Glu Val Glu Leu Ala Ala Lys Gly
565 570 575

Ile Pro Arg Cys Leu Lys Met Arg Arg Asp Trp Ser Ser Pro Ala Gly
580 585 590

Ser Arg Tyr Ala Pro Glu Pro Leu Ala Ser Asp Arg Val Ala Phe Met
595 600 605

Tyr Gly Glu Gly Arg Ser Pro Tyr Tyr Gly Ile Thr Gln Asp Ile His
610 615 620

Arg Ile Trp Pro Glu Leu His Glu Val Ile Asn Glu Lys Thr Asn Arg
625 630 635 640

Leu Trp Ala Glu Gly Asp Arg Trp Val Met Pro Arg Ala Ser Phe Lys
645 650 655

Ser Glu Leu Glu Ser Gln Gln Gln Glu Phe Asp Arg Asn Met Ile Glu
660 665 670

Met Phe Arg Leu Gly Ile Leu Thr Ser Ile Ala Phe Thr Asn Leu Ala
675 680 685

Arg Asp Val Leu Asn Ile Thr Pro Lys Ala Ala Phe Gly Leu Ser Leu
690 695 700

Gly Glu Ile Ser Met Ile Phe Ala Phe Ser Lys Lys Asn Gly Leu Ile
705 710 715 720

Ser Asp Gln Leu Thr Lys Asp Leu Arg Glu Ser Asp Val Trp Asn Lys
725 730 735

Ala Leu Ala Val Glu Phe Asn Ala Leu Arg Glu Ala Trp Gly Ile Pro
740 745 750

Gln Ser Val Pro Lys Asp Glu Phe Trp Gln Gly Tyr Ile Val Arg Gly
755 760 765

Thr Lys Gln Asp Ile Glu Ala Ala Ile Ala Pro Asp Ser Lys Tyr Val
770 775 780 

Arg Leu Thr Ile Ile Asn Asp Ala Asn Thr Ala Leu Ile Ser Gly Lys
785 790 795 800

Pro Asp Ala Cys Lys Ala Ala Ile Ala Arg Leu Gly Gly Asn Ile Pro
805 810 815

Ala Leu Pro Val Thr Gln Gly Met Cys Gly His Cys Pro Glu Val Gly
820 825 830

Pro Tyr Thr Lys Asp Ile Ala Lys Ile His Ala Asn Leu Glu Phe Pro
835 840 845

Val Val Asp Gly Leu Asp Leu Trp Thr Thr Ile Asn Gln Lys Arg Leu
850 855 860

Val Pro Arg Ala Thr Gly Ala Lys Asp Glu Trp Ala Pro Ser Ser Phe
865 870 875 880

Gly Glu Tyr Ala Gly Gln Leu Tyr Glu Lys Gln Ala Asn Phe Pro Gln
885 890 895

Ile Val Glu Thr Ile Tyr Lys Gln Asn Tyr Asp Val Phe Val Glu Val
900 905 910

Gly Pro Asn Asn His Arg Ser Thr Ala Val Arg Thr Thr Leu Gly Pro
915 920 925

Gln Arg Asn His Leu Ala Gly Ala Ile Asp Lys Gln Asn Glu Asp Ala
930 935 940

Trp Thr Thr Ile Val Lys Leu Val Ala Ser Leu Lys Ala His Leu Val
945 950 955 960

Pro Gly Val Thr Ile Ser Pro Leu Tyr His Ser Lys Leu Val Ala Glu
965 970 975

Ala Gln Ala Cys Tyr Ala Ala Leu Cys Lys Gly Glu Lys Pro Lys Lys
980 985 990

Asn Lys Phe Val Arg Lys Ile Gln Leu Asn Gly Arg Phe Asn Ser Lys
995 1000 1005

Ala Asp Pro Ile Ser Ser Ala Asp Leu Ala Ser Phe Pro Pro Ala Asp
1010 1015 1020

Pro Ala Ile Glu Ala Ala Ile Ser Ser Arg Ile Met Lys Pro Val Ala
1025 1030 1035 1040 

Pro Lys Phe Tyr Ala Arg Leu Asn Ile Asp Glu Gln Asp Glu Thr Arg
1045 1050 1055

Asp Pro Ile Leu Asn Lys Asp Asn Ala Pro Ser Ser Ser Ser Ser Ser
1060 1065 1070

Ser Ser Ser Ser Ser Ser Ser Ser Ser Pro Ser Pro Ala Pro Ser Ala
1075 1080 1085

Pro Val Gln Lys Lys Ala Ala Pro Ala Ala Glu Thr Lys Ala Val Ala
1090 1095 1100

Ser Ala Asp Ala Leu Arg Ser Ala Leu Leu Asp Leu Asp Ser Met Leu
1105 1110 1115 1120

Ala Leu Ser Ser Ala Ser Ala Ser Gly Asn Leu Val Glu Thr Ala Pro
1125 1130 1135

Ser Asp Ala Ser Val Ile Val Pro Pro Cys Asn Ile Ala Asp Leu Gly
1140 1145 1150

Ser Arg Ala Phe Met Lys Thr Tyr Gly Val Ser Ala Pro Leu Tyr Thr
1155 1160 1165

Gly Ala Met Ala Lys Gly Ile Ala Ser Ala Asp Leu Val Ile Ala Ala
1170 1175 1180

Gly Arg Gln Gly Ile Leu Ala Ser Phe Gly Ala Gly Gly Leu Pro Met
1185 1190 1195 1200

Gln Val Val Arg Glu Ser Ile Glu Lys Ile Gln Ala Ala Leu Pro Asn
1205 1210 1215

Gly Pro Tyr Ala Val Asn Leu Ile His Ser Pro Phe Asp Ser Asn Leu
1220 1225 1230

Glu Lys Gly Asn Val Asp Leu Phe Leu Glu Lys Gly Val Thr Phe Val
1235 1240 1245

Glu Ala Ser Ala Phe Met Thr Leu Thr Pro Gln Val Val Arg Tyr Arg
1250 1255 1260

Ala Ala Gly Leu Thr Arg Asn Ala Asp Gly Ser Val Asn Ile Arg Asn
1265 1270 1275 1280

Arg Ile Ile Gly Lys Val Ser Arg Thr Glu Leu Ala Glu Met Phe Met
1285 1290 1295 

Arg Pro Ala Pro Glu His Leu Leu Gln Lys Leu Ile Ala Ser Gly Glu
1300 1305 1310

Ile Asn Gln Glu Gln Ala Glu Leu Ala Arg Arg Val Pro Val Ala Asp
1315 1320 1325

Asp Ile Ala Val Glu Ala Asp Ser Gly Gly His Thr Asp Asn Arg Pro
1330 1335 1340

Ile His Val Ile Leu Pro Leu Ile Ile Asn Leu Arg Asp Arg Leu His
1345 1350 1355 1360

Arg Glu Cys Gly Tyr Pro Ala Asn Leu Arg Val Arg Val Gly Ala Gly
1365 1370 1375

Gly Gly Ile Gly Cys Pro Gln Ala Ala Leu Ala Thr Phe Asn Met Gly
1380 1385 1390

Ala Ser Phe Ile Val Thr Gly Thr Val Asn Gln Val Ala Lys Gln Ser
1395 1400 1405

Gly Thr Cys Asp Asn Val Arg Lys Gln Leu Ala Lys Ala Thr Tyr Ser
1410 1415 1420

Asp Val Cys Met Ala Pro Ala Ala Asp Met Phe Glu Glu Gly Val Lys
1425 1430 1435 1440

Leu Gln Val Leu Lys Lys Gly Thr Met Phe Pro Ser Arg Ala Asn Lys
1445 1450 1455

Leu Tyr Glu Leu Phe Cys Lys Tyr Asp Ser Phe Glu Ser Met Pro Pro
1460 1465 1470

Ala Glu Leu Ala Arg Val Glu Lys Arg Ile Phe Ser Arg Ala Leu Glu
1475 1480 1485

Glu Val Trp Asp Glu Thr Lys Asn Phe Tyr Ile Asn Arg Leu His Asn
1490 1495 1500

Pro Glu Lys Ile Gln Arg Ala Glu Arg Asp Pro Lys Leu Lys Met Ser
1505 1510 1515 1520

Leu Cys Phe Arg Trp Tyr Leu Ser Leu Ala Ser Arg Trp Ala Asn Thr
1525 1530 1535

Gly Ala Ser Asp Arg Val Met Asp Tyr Gln Val Trp Cys Gly Pro Ala
1540 1545 1550

Ile Gly Ser Phe Asn Asp Phe Ile Lys Gly Thr Tyr Leu Asp Pro Ala
1555 1560 1565

Val Ala Asn Glu Tyr Pro Cys Val Val Gln Ile Asn Lys Gln Ile Leu
1570 1575 1580

Arg Gly Ala Cys Phe Leu Arg Arg Leu Glu Ile Leu Arg Asn Ala Arg
1585 1590 1595 1600

Leu Ser Asp Gly Ala Ala Ala Leu Val Ala Ser Ile Asp Asp Thr Tyr
1605 1610 1615 

Val Pro Ala Glu Lys Leu 
1620 



<210> 73 
<211> 1551 
<212> PRT 

<213> Schizochytrium aggregatum 
<400> 73 

Arg Ala Glu Ala Gly Arg Glu Pro Glu Pro Ala Pro Gln Ile Thr Ser
1 5 10 15

Thr Ala Ala Glu Ser Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln
20 25 30

Gln Gln Gln Gln Pro Arg Glu Gly Asp Lys Glu Lys Ala Ala Glu Thr
35 40 45 

Met Ala Leu Arg Val Lys Thr Asn Lys Lys Pro Cys Trp Glu Met Thr 
50 55 60 

Lys Glu Glu Leu Thr Ser Gly Lys Thr Glu Val Phe Asn Tyr Glu Glu 
65 70 75 80 

Leu Leu Glu Phe Ala Glu Gly Asp Ile Ala Lys Val Phe Gly Pro Glu
85 90 95

Phe Ala Val Ile Asp Lys Tyr Pro Arg Arg Val Arg Leu Pro Ala Arg
100 105 110 

Glu Tyr Leu Leu Val Thr Arg Val Thr Leu Met Asp Ala Glu Val Asn 
115 120 125 

Asn Tyr Arg Val Gly Ala Arg Met Val Thr Glu Tyr Asp Leu Pro Val 
130 135 140 

Asn Gly Glu Leu Ser Glu Gly Gly Asp Cys Pro Trp Ala Val Leu Val
145 150 155 160

Glu Ser Gly Gln Cys Asp Leu Met Leu Ile Ser Tyr Met Gly Ile Asp
165 170 175

Phe Gln Asn Gln Gly Asp Arg Val Tyr Arg Leu Leu Asn Thr Thr Leu
180 185 190

Thr Phe Tyr Gly Val Ala His Glu Gly Glu Thr Leu Glu Tyr Asp Ile
195 200 205

Arg Val Thr Gly Phe Ala Lys Arg Leu Asp Gly Gly Ile Ser Met Phe
210 215 220

Phe Phe Glu Tyr Asp Cys Tyr Val Asn Gly Arg Leu Leu Ile Glu Met
225 230 235 240

Arg Asp Gly Cys Ala Gly Phe Phe Thr Asn Glu Glu Leu Asp Ala Gly
245 250 255

Lys Gly Val Val Phe Thr Arg Gly Asp Leu Ala Ala Arg Ala Lys Ile
260 265 270

Pro Lys Gln Asp Val Ser Pro Tyr Ala Val Ala Pro Cys Leu His Lys
275 280 285

Thr Lys Leu Asn Glu Lys Glu Met Gln Thr Leu Val Asp Lys Asp Trp
290 295 300

Ala Ser Val Phe Gly Ser Lys Asn Gly Met Pro Glu Ile Asn Tyr Lys
305 310 315 320

Leu Cys Ala Arg Lys Met Leu Met Ile Asp Arg Val Thr Ser Ile Asp
325 330 335

His Lys Gly Gly Val Tyr Gly Leu Gly Gln Leu Val Gly Glu Lys Ile
340 345 350

Leu Glu Arg Asp His Trp Tyr Phe Pro Cys His Phe Val Lys Asp Gln
355 360 365

Val Met Ala Gly Ser Leu Val Ser Asp Gly Cys Ser Gln Met Leu Lys
370 375 380

Met Tyr Met Ile Trp Leu Gly Leu His Leu Thr Thr Gly Pro Phe Asp
385 390 395 400 



Phe Arg Pro Val Asn Gly His Pro Asn Lys Val Arg Cys Arg Gly Gln
405 410 415

Ile Ser Pro His Lys Gly Lys Leu Val Tyr Val Met Glu Ile Lys Glu
420 425 430

Met Gly Phe Asp Glu Asp Asn Asp Pro Tyr Ala Ile Ala Asp Val Asn
435 440 445

Ile Ile Asp Val Asp Phe Glu Lys Gly Gln Asp Phe Ser Leu Asp Arg
450 455 460

Ile Ser Asp Tyr Gly Lys Gly Asp Leu Asn Lys Lys Ile Val Val Asp
465 470 475 480

Phe Lys Gly Ile Ala Leu Lys Met Gln Lys Arg Ser Thr Asn Lys Asn
485 490 495

Pro Ser Lys Val Gln Pro Val Phe Ala Asn Gly Ala Ala Thr Val Gly
500 505 510

Pro Glu Ala Ser Lys Ala Ser Ser Gly Ala Ser Ala Ser Ala Ser Ala
515 520 525

Ala Pro Ala Lys Pro Ala Phe Ser Ala Asp Val Leu Ala Pro Lys Pro
530 535 540

Val Ala Leu Pro Glu His Ile Leu Lys Gly Asp Ala Leu Ala Pro Lys
545 550 555 560

Glu Met Ser Trp His Pro Met Ala Arg Ile Pro Gly Asn Pro Thr Pro
565 570 575

Ser Phe Ala Pro Ser Ala Tyr Lys Pro Arg Asn Ile Ala Phe Thr Pro
580 585 590 

Phe Pro Gly Asn Pro Asn Asp Asn Asp His Thr Pro Gly Lys Met Pro 
595 600 605 

Leu Thr Trp Phe Asn Met Ala Glu Phe Met Ala Gly Lys Val Ser Met 
610 615 620 

Cys Leu Gly Pro Glu Phe Ala Lys Phe Asp Asp Ser Asn Thr Ser Arg 
625 630 635 640 

Ser Pro Ala Trp Asp Leu Ala Leu Val Thr Arg Ala Val Ser Val Ser 
645 650 655 



Asp Leu Lys His Val Asn Tyr Arg Asn Ile Asp Leu Asp Pro Ser Lys
660 665 670

Gly Thr Met Val Gly Glu Phe Asp Cys Pro Ala Asp Ala Trp Phe Tyr
675 680 685

Lys Gly Ala Cys Asn Asp Ala His Met Pro Tyr Ser Ile Leu Met Glu
690 695 700

Ile Ala Leu Gln Thr Ser Gly Val Leu Thr Ser Val Leu Lys Ala Pro
705 710 715 720

Leu Thr Met Glu Lys Asp Asp Ile Leu Phe Arg Asn Leu Asp Ala Asn
725 730 735

Ala Glu Phe Val Arg Ala Asp Leu Asp Tyr Arg Gly Lys Thr Ile Arg
740 745 750

Asn Val Thr Lys Cys Thr Gly Tyr Ser Met Leu Gly Glu Met Gly Val
755 760 765

His Arg Phe Thr Phe Glu Leu Tyr Val Asp Asp Val Leu Phe Tyr Lys
770 775 780

Gly Ser Thr Ser Phe Gly Trp Phe Val Pro Glu Val Phe Ala Ala Gln
785 790 795 800

Ala Gly Leu Asp Asn Gly Arg Lys Ser Glu Pro Trp Phe Ile Glu Asn
805 810 815

Lys Val Pro Ala Ser Gln Val Ser Ser Phe Asp Val Arg Pro Asn Gly
820 825 830

Ser Gly Arg Thr Ala Ile Phe Ala Asn Ala Pro Ser Gly Ala Gln Leu
835 840 845

Asn Arg Arg Thr Asp Gln Gly Gln Tyr Leu Asp Ala Val Asp Ile Val
850 855 860

Ser Gly Ser Gly Lys Lys Ser Leu Gly Tyr Ala His Gly Ser Lys Thr 
865 870 875 880 

Val Asn Pro Asn Asp Trp Phe Phe Ser Cys His Phe Trp Phe Asp Ser 
885 890 895 

Val Met Pro Gly Ser Leu Gly Val Glu Ser Met Phe Gln Leu Val Glu
900 905 910 



Ala Ile Ala Ala His Glu Asp Leu Ala Gly Lys Ala Arg His Cys Gln
915 920 925

Pro His Leu Cys Ala Arg Pro Arg Ala Arg Ser Ser Trp Lys Tyr Arg
930 935 940

Gly Gln Leu Thr Pro Lys Ser Lys Lys Met Asp Ser Glu Val His Ile
945 950 955 960

Val Ser Val Asp Ala His Asp Gly Val Val Asp Leu Val Ala Asp Gly
965 970 975

Phe Leu Trp Ala Asp Ser Leu Arg Val Tyr Ser Val Ser Asn Ile Arg
980 985 990

Val Arg Ile Ala Ser Gly Glu Ala Pro Ala Ala Ala Ser Ser Ala Ala
995 1000 1005 

Ser Val Gly Ser Ser Ala Ser Ser Val Glu Arg Thr Arg Ser Ser Pro 
1010 1015 1020 



Ala Val Ala Ser Gly Pro Ala Gln Thr Ile Asp Leu Lys Gln Leu Lys
1025 1030 1035 1040

Thr Glu Leu Leu Glu Leu Asp Ala Pro Leu Tyr Leu Ser Gln Asp Pro
1045 1050 1055

Thr Ser Gly Gln Leu Lys Lys His Thr Asp Val Ala Ser Gly Gln Ala
1060 1065 1070

Thr Ile Val Gln Pro Cys Thr Leu Gly Asp Leu Gly Asp Arg Ser Phe
1075 1080 1085

Met Glu Thr Tyr Gly Val Val Ala Pro Leu Tyr Thr Gly Ala Met Ala
1090 1095 1100

Lys Gly Ile Ala Ser Ala Asp Leu Val Ile Ala Ala Gly Lys Arg Lys
1105 1110 1115 1120

Ile Leu Gly Ser Phe Gly Ala Gly Gly Leu Pro Met His His Val Arg
1125 1130 1135

Ala Ala Leu Glu Lys Ile Gln Ala Ala Leu Pro Gln Gly Pro Tyr Ala
1140 1145 1150

Val Asn Leu Ile His Ser Pro Phe Asp Ser Asn Leu Glu Lys Gly Asn
1155 1160 1165 

Val Asp Leu Phe Leu Glu Lys Gly Val Thr Val Val Glu Ala Ser Ala
1170 1175 1180

Phe Met Thr Leu Thr Pro Gln Val Val Arg Tyr Arg Ala Ala Gly Leu
1185 1190 1195 1200

Ser Arg Asn Ala Asp Gly Ser Val Asn Ile Arg Asn Arg Ile Ile Gly
1205 1210 1215

Lys Val Ser Arg Thr Glu Leu Ala Glu Met Phe Ile Arg Pro Ala Pro
1220 1225 1230

Glu His Leu Leu Glu Lys Leu Ile Ala Ser Gly Glu Ile Thr Gln Glu
1235 1240 1245

Gln Ala Glu Leu Ala Arg Arg Val Pro Val Ala Asp Asp Ile Ala Val
1250 1255 1260

Glu Ala Asp Ser Gly Gly His Thr Asp Asn Arg Pro Ile His Val Ile
1265 1270 1275 1280

Leu Pro Leu Ile Ile Asn Leu Arg Asn Arg Leu His Arg Glu Cys Gly
1285 1290 1295

Tyr Pro Ala His Leu Arg Val Arg Val Gly Ala Gly Gly Gly Val Gly 
1300 1305 1310 

Cys Pro Gln Ala Ala Ala Ala Ala Leu Thr Met Gly Ala Ala Phe Ile
1315 1320 1325

Val Thr Gly Thr Val Asn Gln Val Ala Lys Gln Ser Gly Thr Cys Asp
1330 1335 1340

Asn Val Arg Lys Gln Leu Ser Gln Ala Thr Tyr Ser Asp Ile Cys Met
1345 1350 1355 1360

Ala Pro Ala Ala Asp Met Phe Glu Glu Gly Val Lys Leu Gln Val Leu
1365 1370 1375

Lys Lys Gly Thr Met Phe Pro Ser Arg Ala Asn Lys Leu Tyr Glu Leu
1380 1385 1390

Phe Cys Lys Tyr Asp Ser Phe Asp Ser Met Pro Pro Ala Glu Leu Glu
1395 1400 1405

Arg Ile Glu Lys Arg Ile Phe Lys Arg Ala Leu Gln Glu Val Trp Glu
1410 1415 1420

Glu Thr Lys Asp Phe Tyr Ile Asn Gly Leu Lys Asn Pro Glu Lys Ile
1425 1430 1435 1440

Gln Arg Ala Glu His Asp Pro Lys Leu Lys Met Ser Leu Cys Phe Arg
1445 1450 1455 

Trp Tyr Leu Gly Leu Ala Ser Arg Trp Ala Asn Met Gly Ala Pro Asp 
1460 1465 1470 

Arg Val Met Asp Tyr Gln Val Trp Cys Gly Pro Ala Ile Gly Ala Phe
1475 1480 1485

Asn Asp Phe Ile Lys Gly Thr Tyr Leu Asp Pro Ala Val Ser Asn Glu
1490 1495 1500

Tyr Pro Cys Val Val Gln Ile Asn Leu Gln Ile Leu Arg Gly Ala Cys
1505 1510 1515 1520

Tyr Leu Arg Arg Leu Asn Ala Leu Arg Asn Asp Pro Arg Ile Asp Leu
1525 1530 1535 

Glu Thr Glu Asp Ala Ala Phe Val Tyr Glu Pro Thr Asn Ala Leu 
1540 1545 1550 



<210> 74 
<211> 30 
<212> DNA 

<213> Schizochytrium aggregatum
<400> 74 

taccgcggca agactatccg caacgtcacc 30

<210> 75 
<211> 30 
<212> DNA 

<213> Schizochytrium aggregatum 
<400> 75 

gccgtcgtgg gcgtccacgg acacgatgtg 30

<210> 76 
<211> 4767 
<212> DNA 

<213> Schizochytrium aggregatum 
<400> 76 

cgagcagagg ccggccgcga gcccgagccc gcgccgcaga tcactagtac cgctgcggaa 60
tcacagcagc agcagcagca gcagcagcag cagcagcagc agcagcagcc acgagaggga 120
gataaagaaa aagcggcaga gacgatggcg ctccgtgtca agacgaacaa gaagccatgc 180
tgggagatga ccaaggagga gctgaccagc ggcaagaccg aggtgttcaa ctatgaggaa 240 
ctcctcgagt tcgcagaggg cgacatcgcc aaggtcttcg gacccgagtt cgccgtcatc 300 
gacaagtacc cgcgccgcgt gcgcctgccc gcccgcgagt acctgctcgt gacccgcgtc 360 
accctcatgg acgccgaggt caacaactac cgcgtcggcg cccgcatggt caccgagtac 420 
gatctccccg tcaacggaga gctctccgag ggcggagact gcccctgggc cgtcctggtc 480 
gagagtggcc agtgcgatct catgctcatc tcctacatgg gcattgactt ccagaaccag 540
ggcgaccgcg tctaccgcct gctcaacacc acgctcacct tttacggcgt ggcccacgag 600 
ggcgagaccc tcgagtacga cattcgcgtc accggcttcg ccaagcgtct cgacggcggc 660 
atctccatgt tcttcttcga gtacgactgc tacgtcaacg gccgcctcct catcgagatg 720 
cgcgatggct gcgccggctt cttcaccaac gaggagctcg acgccggcaa gggcgtcgtc 780 
ttcacccgcg gcgacctcgc cgcccgcgcc aagatcccaa agcaggacgt ctccccctac 840 
gccgtcgccc cctgcctcca caagaccaag ctcaacgaaa aggagatgca gaccctcgtc 900 
gacaaggact gggcatccgt ctttggctcc aagaacggca tgccggaaat caactacaaa 960 
ctctgcgcgc gtaagatgct catgattgac cgcgtcacca gcattgacca caagggcggt 1020 
gtctacggcc tcggtcagct cgtcggtgaa aagatcctcg agcgcgacca ctggtacttt 1080 
ccctgccact ttgtcaagga tcaggtcatg gccggatccc tcgtctccga cggctgcagc 1140 
cagatgctca agatgtacat gatctggctc ggcctccacc tcaccaccgg accctttgac 1200 
ttccgcccgg tcaacggcca ccccaacaag gtccgctgcc gcggccaaat ctccccgcac 1260 
aagggcaagc tcgtctacgt catggagatc aaggagatgg gcttcgacga ggacaacgac 1320 
ccgtacgcca ttgccgacgt caacatcatt gatgtcgact tcgaaaaggg ccaggacttt 1380 
agcctcgacc gcatcagcga ctacggcaag ggcgacctca acaagaagat cgtcgtcgac 1440
tttaagggca tcgctctcaa gatgcagaag cgctccacca acaagaaccc ctccaaggtt 1500 
cagcccgtct ttgccaacgg cgccgccact gtcggccccg aggcctccaa ggcttcctcc 1560 
ggcgccagcg ccagcgccag cgccgccccg gccaagcctg ccttcagcgc cgatgttctt 1620 
gcgcccaagc ccgttgccct tcccgagcac atcctcaagg gcgacgccct cgcccccaag 1680 
gagatgtcct ggcaccccat ggcccgcatc ccgggcaacc cgacgccctc ttttgcgccc 1740
tcggcctaca agccgcgcaa catcgccttt acgcccttcc ccggcaaccc caacgataac 1800 
gaccacaccc cgggcaagat gccgctcacc tggttcaaca tggccgagtt catggccggc 1860
aaggtcagca tgtgcctcgg ccccgagttc gccaagttcg acgactcgaa caccagccgc 1920 
agccccgctt gggacctcgc tctcgtcacc cgcgccgtgt ctgtgtctga cctcaagcac 1980 
gtcaactacc gcaacatcga cctcgacccc tccaagggta ccatggtcgg cgagttcgac 2040 
tgccccgcgg acgcctggtt ctacaagggc gcctgcaacg atgcccacat gccgtactcg 2100 
atcctcatgg agatcgccct ccagacctcg ggtgtgctca cctcggtgct caaggcgccc 2160 
ctgaccatgg agaaggacga catcctcttc cgcaacctcg acgccaacgc cgagttcgtg 2220 
cgcgccgacc tcgactaccg cggcaagact atccgcaacg tcaccaagtg cactggctac 2280 
agcatgctcg gcgagatggg cgtccaccgc ttcacctttg agctctacgt cgatgatgtg 2340 
ctcttttaca agggctcgac ctcgttcggc tggttcgtgc ccgaggtytt tgccgcccag 2400 
gccggcctcg acaacggccg caagtcggag ccctggttca ttgagaacaa ggttccggcc 2460
tcgcaggtct cctcctttga cgtgcgcccc aacggcagcg gccgcaccgc catcttcgcc 2520 
aacgccccca gcggcgccca gctcaaccgc cgcacggacc agggccagta cctcgacgcc 2580 
gtcgacattg tctccggcag cggcaagaag agcctcggct acgcccacgg ttccaagacg 2640 
gtcaacccga acgactggtt cttctcgtgc cacttttggt ttgactcggt catgcccgga 2700 
agtctcggtg tcgagtccat gttccagctc gtcgaggcca tcgccgccca cgaggatctc 2760
gctggcaaag cacggcattg ccaaccccac ctttgtgcac gcccccgggc aagatcaagc 2820 
tggaagtacc gcggscagct cacgcccaag agcaagaaga tggactcgga ggtccacatc 2880 
gtgtccgtgg acgcccacga cggcgttgtc gacctcgtcg ccgacggctt cctctgggcc 2940 
gacagcctcc gcgtccactc ggtgagcaac attcgcgtgc gcatcgcctc cggtgaggcc 3000 
cctgccgccg cctcctccgc cgcctctgtg ggctcctcgg cttcgtccgt cgagcgcacg 3060
cgctcgagcc ccgctgtcgc ctccggcccg gcccagacca tcgacctcaa gcagctcaag 3120 
accgagctcc tcgagctcga tgccccgctc tacctctcgc aggacccgac cagcggccag 3180 
ctcaagaagc acaccgacgt ggcctccggc caggccacca tcgtgcagcc ctgcacgctc 3240 
ggcgacctcg gtgaccgctc cttcatggag acctacggcg tcgtcgcccc gctgtacacg 3300 
ggcgccatgg ccaagggcat tgcctcggcg gacctcgtca tcgccgccgg caagcgcaag 3360 
atcctcggct cctttggcgc cggcggcctc cccatgcacc acgtgcgcgc cgccctcgag 3420 
aagatccagg ccgccctgcc tcagggcccc tacgccgtca acctcatcca ctcgcctttt 3480 
gacagcaacc tcgagaaggg caacgtcgat ctcttcctcg agaagggcgt cactgtggtg 3540 
gaggcctcgg cattcatgac cctcaccccg caggtcgtgc gctaccgcgc cgccggcctc 3600 
tcgcgcaacg ccgacggttc ggtcaacatc cgcaaccgca tcatcggcaa ggrctcgcgc 3660 
accgagctcg ccgagatgtt catccgcccg gccccggagc acctcctcga gaagctcatc 3720 
gcctcgggcg agatcaccca ggagcaggcc gagctcgcgc gccgcgttcc cgtcgccgac 3780
gatatcgctg tcgaggctga ctcgggcggc cacaccgaca accgccccat ccacgtcatc 3840
ctcccgctca tcatcaacct ccgcaaccgc ctgcaccgcg agtgcggcta ccccgcgcac 3900 
ctccgcgtcc gcgttggcgc cggcggtggc gtcggctgcc cgcaggccgc cgccgccgcg 3960 
ctcaccatgg gcgccgcctt catcgtcacc ggcactgtca accaggtcgc caagcagtcc 4020 
ggcacctgcg acaacgtgcg caagcagctc tcgcaggcca cctactcgga tatctgcatg 4080 
gccccggccg ccgacatgtt cgaggagggc gtcaagctcc aggtcctcaa gaagggaacc 4140 
atgttcccct cgcgcgccaa caagctctac gagctctttt gcaagtacga ctccttcgac 4200 
tccatgcctc ctgccgagct cgagcgcatc gagaagcgta tcttcaagcg cgcactccag 4260
gaggtctggg aggagaccaa ggacttttac antaacggtc tcaagaaccc ggagaagatc 4320 
cagcgcgccg agcacgaccc caagctcaag atgtcgctct gcttccgctg gtaccttggt 4380 
ctfgccagcc gctgggccaa catgggcgcc ccggaccgcg tcatggacta ccaggtctgg 4440
tgtggcccgg ccattggcgc cttcaacgac ttcatcaagg gcacctacct cgaccccgct 4500 
gtctccaacg agtacccctg tgtcgtccag atcaacctgc aaatcctccg tggtgcctgc 4560 
tacctgcgcc gtctcaacgc cctgcgcaac gacccgcgca ttgacctcga gaccgaggat 4620 
gctgcctttg tctacgagcc caccaacgcg ctctaagaaa gtgaaccttg tcctaacccg 4680 
acagcgaatg gcgggagggg gcgggctaaa agatcgtatt acatagtatt tttcccctac 4740 
tctttgtgaa aaaaaaaaaa aaaaaaa 4767 



<210> 77
<211> 7959 
<212> DNA 

<213> Vibrio marinus 



<400> 77 

atggctaaaa agaacaccac atcgattaag cacgccaagg atgtgttaag tagtgatgat 60
caacagttaa attctcgctt gcaagaatgt ccgattgcca tcattggtat ggcatcggtt 120
tttgcagatg ctaaaaactt ggatcaattc tgggataaca tcgttgactc tgtggacgct 180
attattgatg tgcctagcga tcgctggaac attgacgacc attactcggc tgataaaaaa 240
gcagctgaca agacatactg caaacgcggt ggtttcattc cagagcttga ttttgatccg 300
atggagtttg gtttaccgcc aaatatcctc gagttaactg acatcgctca attgttgtca 360
ttaattgttg ctcgtgatgt attaagtgat gctggcattg gtagtgatta tgaccatgat 420
aaaattggta tcacgctggg tgtcggtggt ggtcagaaac aaatttcgcc attaacgtcg 480
cgcctacaag gcccggtatt agaaaaagta ttaaaagcct caggcattga tgaagatgat 540
cgcgctatga tcatcgacaa atttaaaaaa gcctacatcg gctgggaaga gaactcattc 600
ccaggcatgc taggtaacgt tattgctggt cgtatcgcca atcgttttga ttttggtggt 660
actaactgtg tggttgatgc ggcatgcgct ggctcccttg cagctgttaa aatggcgatc 720
tcagacttac ttgaatatcg ttcagaagtc atgatatcgg gtggtgtatg ttgtgataac 780
tcgccattca tgtatatgtc attctcgaaa acaccagcat ttaccaccaa tgatgatatc 840 
cgtccgtttg atgacgattc aaaaggcatg ctggttggtg aaggtattgg catgatggcg 900 
tttaaacgtc ttgaagatgc tgaacgtgac ggcgacaaaa tttattctgt actgaaaggt 960 
atcggtacat cttcagatgg tcgtttcaaa tctatttacg ctccacgccc agatggccaa 1020 
gcaaaagcgc taaaacgtgc ttatgaagat gccggttttg cccctgaaac atgtggtcta 1080 
attgaaggcc atggtacggg taccaaagcg ggtgatgccg cagaatttgc tggcttgacc 1140 
aaacactttg gcgccgccag tgatgaaaag caatatatcg ccttaggctc agttaaatcg 1200 
caaattggtc atactaaatc tgcggctggc tctgcgggta tgattaaggc ggcattagcg 1260 
ctgcatcata aaatcttacc tgcaacgatc catatcgata aaccaagtga agccttggat 1320 
atcaaaaaca gcccgttata cctaaacagc gaaacgcgtc cttggatgcc acgtgaagat 1380 
ggtattccac gtcgtgcagg tatcagctca tttggttttg gcggcaccaa cttccatatt 1440 
attttagaag agtatcgccc aggtcacgat agcgcatatc gcttaaactc agtgagccaa 1500 
actgtgttga tctcggcaaa cgaccaacaa ggtattgttg ctgagttaaa taactggcgt 1560 
actaaactgg ctgtcgatgc tgatcatcaa gggtttgtat ttaatgagtt agtgacaacg 1620 
tggccattaa aaaccccatc cgttaaccaa gctcgtttag gttttgttgc gcgtaatgca 1680 
aatgaagcga tcgcgatgat tgatacggca ttgaaacaat tcaatgcgaa cgcagataaa 1740 
atgacatggt cagtacctac cggggtttac tatcgtcaag ccggtattga tgcaacaggt 1800 
aaagtggttg cgctattctc agggcaaggt tcgcaatacg tgaacatggg tcgtgaatta 1860 
acctgtaact tcccaagcat gatgcacagt gctgcggcga tggataaaga gttcagtgcc 1920 
gctggtttag gccagttatc tgcagttaot ttccctatcc ctgtttatac ggatgccgag 1980 
cgtaagctac aagaagagca attacgttta acgcaacatg cgcaaccagc gattggtagt 2040 
ttgagtgttg gtctgttcaa aacgtttaag caagcaggtt ttaaagctga ttttgctgcc 2100 
ggtcatagtt tcggtgagtt aaccgcatta tgggctgccg atgtattgag cgaaagcgat 2160 
tacatgatgt tagcgcgtag tcgtggtcaa gcaatggctg cgccagagca acaagatttt 2220 
gatgcaggta agatggccgc tgttgttggt gatccaaagc aagtcgctgt gatcattgat 2280 
acccttgatg atgtctctat tgctaacttc aactcgaata accaagttgt tattgctggt 2340 
actacggagc aggttgctgt agcggttaca accttaggta atgctggttt caaagttgtg 2400 
ccactgccgg tatctgctgc gttccacaca cctttagttc gtcacgcgca aaaaccattt 2460
gctaaagcgg ttgatagcgc taaatttaaa gcgccaagca ttccagtgtt tgctaatggc 2520 
acaggcttgg tgcattcaag caaaccgaat gacattaaga aaaacctgaa aaaccacatg 2580 
ctggaatctg ttcatttcaa tcaagaaatt gacaacatct atgctgatgg tggccgcgta 2640 
tttatcgaac ttggtccaaa gaatgtatta actaaattgg ttgaaaacat tctcactgaa 2700 
aaatctgatg tgactgctat cgcggttaat gctaatccta aacaacctgc ggacgtacaa 2760 
atgcgccaag ctgcgctgca aatggcagtg cttggtgtcg cattagacaa tattgacccg 2820 
tacgacgccg ttaagcgtcc acttgttgcg ccgaaagcat caccaatgtt gatgaagtta 2880 
tctgcagcgt cttatgttag tccgaaaacg aagaaagcgt ttgctgatgc attgactgat 2940 
ggctggactg ttaagcaagc gaaagctgta cctgctgttg tgtcacaacc acaagtgatt 3000 
gaaaagatcg ttgaagttga aaagatagtt gaacgcattg tcgaagtaga gcgtattgtc 3060 
gaagtagaaa aaatcgtcta cgttaatgct gacggttcgc ttatatcgca aaataatcaa 3120 
gacgttaaca gcgctgttgt tagcaacgtg actaatagct cagtgactca tagcagtgat 3180 
gctgaccttg ttgcctctat tgaacgcagt gttggtcaat ttgttgcaca ccaacagcaa 3240 
ttattaaatg tacatgaaca gtttatgcaa ggtccacaag actacgcgaa aacagtgcag 3300 
aacgtacttg ctgcgcagac gagcaatgaa ttaccggaaa gtttagaccg tacattgtct 3360 
atgtataacg agttccaatc agaaacgcta cgtgtacatg aaacgtacct gaacaatcag 3420 
acgagcaaca tgaacaccat gcttactggt gctgaagctg atgtgctagc aaccccaata 3480 
actcaggtag tgaatacagc cgtcgccact agtcacaagg tagttgctcc agctattgct 3540 
==.rarrat atctaqtgtc agtaataacg cggcggttgc agtgcaaacC 3600
aatacagtga cgaatgrtgt -^J^^^ ^ J ctactacgcc agcacccgca 3660 

gtggcattag cgcctacgca -^aaatcg^ J .tgctacaga agttgcacca 3720 

..,,..gc.a .cgtggc.ga a- g^^g^g . ^^^^^^^^^^ ^^^^^^^^^^ ^^^^ 

attacaccat cagttacacc ^Jttg 9 atccaacgga tatgctggaa 3840 

:™: ::::::::: a.cgac.aa .aaacg.. .gaga.a.a 3.0 

ccgagcatgg ccctaactta cctgaactta atcctgaaga tcttgctgag 3960 

ggcgcagtac aggaat g ga g^^^^^^ ^^^^^^^^^^ ^^^^ 

:::::::::: ::::::::: ::caLg.c .cg=c.gca. .gccgg^a. .ga...agcc .oso 

cacltccaaa acgtaatgtt agaagtggtt gcagacaaaa ccggttaccc aacagacatg 4140 
c agaactga gcatggatat ggaagctgac ttaggtattg attcaatcaa gcgtgtggaa 20 
atc!tagg!g cagtacagga gatcataact gatttacctg agctaaaccc tgaagatctt 42 0 
g : a Lc gcaccctagg tgaaatcgtt agttacatgc aaagcaaagc gccagtcgc 
gaaagtgcgc cagtggcgac ggctcctgta gcaacaagct cagcaccgtc tatcgatttg 4380 
aaccacattc aaacagtgat gatggatgta gttgcagata agactggtta tccaactgac 4440
atgctagaac tcggcatgga catggaagct gatttaggta tcgattcaat caaacgtgtg 4500
gaaatattag gcgcagtgca ggagatcatc actgatttac ctgagctaaa cccagaagac 4560
ctcgctgaat tacgcacgct aggtgaaatc gttagttaca tgcaaagcaa agcgccagtc 4620 
gctgaglgtg cgccagtagc gacggcttct gtagcaacaa gctctgcacc gtctatcgat 680 

aaaccat! tccaaacagt gatgatggaa gtggttgcag acaaaaccgg ttatccagta 74 
gacatgttag aacttgctat ggacatggaa gctgacctag gtatcgattc -tcaagcg. 800 
gta^alattt taggtgcggt acaggaaatc attactgact tacctgagct taaccctgaa 8 0 
gatcttgctg aactacgtac attaggtgaa atcgttagtt acatgcaaag caaagcgccc 920 
gtagctjaag cgcctgcagt acctgttgca gtagaaagtg cacctactag tg-acaagc 9 0 
Lagcaccgt ctatcgattt agaccacatc caaaatgtaa tgatggatgt t^ttgctgat 04 
aagactggtt atcctgccaa tatgcttgaa ttagcaatgg acatggaagc cgaccttggt 5100
attgattcaa tcaagcgtgt tgaaattcta ggcgcggtac aggagatcat -ctgattta 1 
cccgaactaa acccagaaga cttagctgaa ctacgtacgt tagaagaaat «^^-acctac 220 
atgcaaagca aggcgagtgg tgttactgta aatgtagtgg ctagccctga aaataatgct 5280
gtatcagatg catttatgca aagcaatgtg gcgactatca cagcggccgc ^^aacataag 5340 
gcggaattta aaccggcgcc gagcgcaacc gttgctatct ctcgtctaag c^tctatcagt 00 
aaaataagcc aagattgtaa aggtgctaac gccttaatcg tagctgatgg cactgataat 5460
gctgtgttac ttgcagacca cctattgcaa actggctgga atgtaactgc attgcaacca 5520
acttgggtag ctgtaacaac gacgaaagca tttaataagt cagtgaacct ggtgacttta 5580 
aatggcgttg atgaaactga aatcaacaac attattactg ctaacgcaca attggatgca 5640
gttatctatc tgcacgcaag tagcgaaatt aatgctatcg aatacccaca agcatctaag 5700
caaggcctga tgttagcctt cttattagcg aaattgagta aagtaactca agccgctaaa 5760 
gtgcgtggcg cctttatgat tgttactcag cagggtggtt cattaggttt tgatgatatc 5820 
gattctgcta caagtcatga tgtgaaaaca gacctagtac aaagcggctt aaacggttta 5880
gttaagacac tgtctcacga gtgggataac gtattctgtc gtgcggttga tattgcttcg 5940 
tcattaacgg ctgaacaagt tgcaagcctt gttagtgatg aactacttga tgctaacact 6000 
gtattaacag aagtgggtta tcaacaagct ggtaaaggcc ttgaacgtat cacgttaact 6060 
ggtgtggcta ctgacagcta tgcattaaca gctggcaata acatcgatgc taactcggta 6120 
tttttagtga gtggtggcgc aaaaggtgta actgcacatt gtgttgctcg tatagctaaa 6180 
gaatatcagt ctaagttcat cttattggga cgttcaacgt tctcaagtga cgaaccgagc 6240 
tgggcaagtg gtattactga tgaagcggcg ttaaagaaag cagcgatgca gtctttgatt 6300 
acagcaggtg ataaaccaac acccgttaag atcgtacagc taatcaaacc aatccaagct 6360
aatcgtgaaa ttgcgcaaac cttgtctgca attaccgctg ctggtggcca agctgaatat 6420 
gtttccgcag atgraactaa tgcagcaagc gtacaaatgg cag.cgctcc agctatcgct 6480
aagttcggrg caatcactgg catcattcat ggcgcgggtg tgttagctga ccaattcatt 6540 
gagcaaaaaa cactgagtga ttttgagtct gtttacagca ctaaaattga cggtttgtta 6600 
tcgctactat cagtcactga agcaagcaac atcaagcaat tggtattgtt ctcgtcagcg 6660 
gctggtttct acggtaaccc cggccagtct gattactcga ttgccaatga gatcttaaat 6720 
aaaaccgcat accgctttaa atcattgcac ccacaagctc aagtattgag ctttaactgg 6780 
ggtccttggg acggtggcat ggtaacgcct gagcttaaac gtatgtttga ccaacgtggt 6840 
gtttacatta ttccacttga tgcaggtgca cagttattgc tgaatgaact agccgctaat 6900 
gataaccgtt gtccacaaat cctcgtgggt aatgacttat ctaaagatgc tagctctgat 6960 
caaaagtctg atgaaaagag tactgctgta aaaaagccac aagttagtcg tttatcagat 7020 
gctttagtaa ctaaaagtat caaagcgact aacagtagct ctttatcaaa caagactagt 7080 
gctttatcag acagtagtgc ttttcaggtt aacgaaaacc actttttagc tgaccacatg 7140 
atcaaaggca atcaggtatt accaacggta tgcgcgattg cttggatgag tgatgcagca 7200 
aaagcgactt atagtaaccg agactgtgca ttgaagtatg tcggtttcga agactataaa 7260 
ttgtttaaag gtgtggtttt tgatggcaat gaggcggcgg attaccaaat ccaattgtcg 7320 
cctgtgacaa gggcgtcaga acaggattct gaagtccgta ttgccgcaaa gatctttagc 7380 
ctgaaaagtg acggtaaacc tgtgtttcat tatgcagcga caatattgtt agcaactcag 7440 
ccacttaatg ctgtgaaggt agaacttccg acattgacag aaagtgttga tagcaacaat 7500 
aaagtaactg atgaagcaca agcgttatac agcaatggca ccttgttcca cggtgaaagt 7560 
ctgcagggca ttaagcagat attaagttgt gacgacaagg gcctgctatt ggcttgtcag 7620
ataaccgatg ttgcaacagc taagcaggga tccttcccgt tagctgacaa caatatcttt 7680
gccaatgatt tggtttatca ggctatgttg gtctgggtgc gcaaacaatt tggtttaggt 7740 
agcttacctt cggtgacaac ggcttggact gtgtatcgtg aagtggttgt agatgaagta 7800 
ttttatctgc aacttaatgt tgttgagcat gatctattgg gttcacgcgg cagtaaagcc 7860
cgttgtgata ttcaattgat tgctgctgat atgcaattac ttgccgaagt gaaatcagcg 7920 
caagtcagtg tcagtgacat tttgaacgat atgtcatga 7959

<210> 78 
<211> 2652
<212> DNA 

<213> Vibrio marinus 
<400> 78 

atgacggaat tagctgttat tggtatggat gctaaattta gcggacaaga caatattgac 60 
cgtgtggaac gcgctttcta tgaaggtgct tatgtaggta atgttagccg cgttagtacc 120
gaatctaatg ttattagcaa tggcgaagaa caagttatta ctgccatgac agttcttaac 180 
tctgtcagtc tactagcgca aacgaatcag ttaaatatag ctgatatcgc ggtgttgctg 240 
attgctgatg taaaaagtgc tgatgatcag cttgtagtcc aaattgcatc agcaattgaa 300 
aaacagtgtg cgagttgtgt tgttattgct gatttaggcc aagcattaaa tcaagtagct 360 
gatttagtta ataaccaaga ctgtcctgtg gctgtaattg gcatgaataa ctcggttaat 420 
ttatctcgtc atgatcttga atctgtaact gcaacaatca gctttgatga aaccttcaat 480
ggttataaca atgtagctgg gttcgcgagt ttacttatcg cttcaactgc gtttgccaat 540 
gctaagcaat gttatatata cgccaacatt aagggcttcg ctcaatcggg cgtaaatgct 600 
caatttaacg ttggaaacat tagcgatact gcaaagaccg cattgcagca agctagcata 660 
actgcagagc aggttggttt gttagaagtg tcagcagtcg ctgattcggc aatcgcattg 720 
tctgaaagcc aaggtttaat gtctgcttat catcatacgc aaactttgca tactgcatta 780 
agcagtgccc gtagtgtgac tggtgaaggc gggtgttttt cacaggtcgc aggtttattg 840 
aaatgtgtaa ttggtttaca tcaacgttat attccggcga ttaaagattg gcaacaaccg 900 
^^^t rcaccattct atatgcctgt agatgctcga 960
agtgacaatc aaatgtcacg gtggcggaat rcaccattct a g y ^^.^^^tact 1020 

rr.::: :::::::::: :::::::::: ™» 

gacagctatt g cttaactgaa agcaagcttc agactcttga acaaaacaat 1140 

a<-acitcttac aagataatga cttacn-uyoa «y lonn 
r^^cL atltgcgcac taatggttac tttgcatcga gcgagttagc attaatcata 1200 
ccagtagctg -"^^'^^cac tgtgaattag aaactattac agggcagtta 1260 

::::::!!:; iLga^cg cagcaga..g ..a.gcccg. 1320 

aargrtaca acaaagccta tagcgcagtg cttattgccg agactgctga agagttaagc 

Zl ccttqgcgtt tgctggtatc gctagcgtgt ttaargaaga tgctaaagaa 1440 
::::::: gra rgfag .l.ll^co ,..o.,.o., gg^gc^aac ISOO 

tggaaaaccc ^^ JJl J ,,,catgtac ccaggtattg gtgctacata tgttggttta 1560 
agcacacaga -^^9^^^- ,,,,,c!cag atttatcagc ctgtagcggc tttagccgat 1620 
a -i:: aga.ac...:.c..aa.ccac gcag.a..ag .cg.ca.agc 1630 
ft aaag!ac tcaagcagtt ggatctggac ctgcgcggta acttagccaa -.cgc gaa . 

attttgcttg tgtgtttacc aaggtatttg aagaagtctt tgccgttaaa 1800 

- r^^' -^'-^^^^ ^^^^^^-^'^ ^^^^^^^^^ ;:ro 

rggcagcaa! cgggattgat gagtgctcgc cttgcacaat cgaatacctt taatcatcaa 
ctttgcggcg agttaagaac actacgtcag cat.ggggca tggatgatgt agc.aacggt 98 
acgttcgagc agatctggga aacctatacc attaaggcaa cgattgaaca ggtcg a^t^^ 
gcLctgcag a.gaagatcg tgtgtattgc accattatca atacacctga tagcttgttg 2100 
! agccggtt a/ccagaagc ctgtcagcga gtcattaaga att.aggtgt gcg.gcaat 
gca-Itg!ata tggcgaacgc aattcacagc gcgccagctt atgccgaata cga.catatg 2220 
gttgagctat accatatgga tgttactcca cgtattaata ccaagatgta .tcaag tea 22 
LtLIttac cgat.ccaca acgcagcaaa gcgatttccc acagtattgc taaatgtttg 2340 
Tgcgatgtgg tgga.ttccc acgtttggt. aataccttac atgacaaagg tgcgcg.^^^^^^ 
tL!ttgaaa tgggtccagg tcgttcgtta tgragctggg tagataagat cttagttaat 2460 
ggcgatggcg afaataaaaa gcaaagccaa catgtatotg ttcctgtgaa .gccaaaggc 2 
rcclgtgatg aac.cactta tattcgtgcg attgctaagt taattagtca tggcgtgaat 25 0 
Tgaatt^a: a.agcttgtt taacgggtca atcctggtta aagcaggcca tacagcaaac 2640 
acgaacaaat ag 2652



<210> 79 
<211> 6057 
<212> DNA 

<213> Vibrio marinus 
<400> 79



atggatttaa agagagtaat tatggaaaat attgcagtag taggtattgc taat tgttc 0 
ccgggctcac aagcaccgga tcaattttgg cagcaa.tgc ttgaacaaca agat.gccg 0 
ag!!aggcga ccgctgttca aatgggcgtt gatcctgcta aatataccgc caacaaaggt 180 
gacacagaL aat.t.actg tgtgcacggc ggttacatca gtgatttcaa ttttgatgct 2 0 
tcaggtta.c aactcgataa tgattattta gccggtttag atgaccttaa ccaatggggg 300 
ctttatgtta cgaaacaagc ccttaccgat gcgggttatt ggggcagtac tgoactagaa 3 0 
aactgtggtg tgattttagg taatttgtca ttcccaacta aatcatctaa tcagctgttt 420 
atgccttcgt atcatcaagt tgttgataat gccttaaagg cggtattaca tcctgatttt 480 
caattaacgc attacacagc accgaaaaaa acacatgctg acaatgcatt ag.agcaggt 540 
tatccagctg cattgatcgc gcaagcggcg ggtcttggtg gttcacattt tgcactggat 600 
gcggcttgtg ct.catcttg ttacagcgtt aagttagcgt gtgattacct gcatacgggt 660 
aaagccaaca tgatgctcgc tggtgcggta tctgcagcag atcctatgtt cgtaaatatg 720
ggtttctcga tattccaagc ttacccagct aacaatgtac atgccccgtt tgaccaaaa 0 
cacaaggcc catttgccgg .gaaggcgcg ggcatgatgg tattgaaacg tcaaag gat 8 0 
agtac tg atggtgacca tatttacgcc attattaaag gcggcgcatt atcgaatga 
ggtaaaggcg agtttgtatt aagcccgaac accaagggcc aagtattagt atatgaacgt 960
gcttatgccg atgcagatgt tgacccgagt acagttgact atattgaatg tcatgcaacg 1020
g cacaccta agggtgacaa tgttgaattg cgttcgatgg aaaccttttt cagtcgcgta 
aataacaaac cattactggg ctcggttaaa tctaaccttg gtcatttgtt aac gccgct 40 
ggtatgcctg gcatgaccaa agctatgtta gcgctaggta aaggtcttat tcctgcaacg 1200
!!taacttaa agcaaccact gcaatctaaa aacggttact ttactggcga ^caaatgcca 2 0 
acgacgactg tgtcttggcc aacaactccg ggtgccaagg cagataaacc gcgtaccgca 1320
ggtgtgagcg tatttggttt tggtggcagc aacgcccatt tggtattaca acagccaacg 1380
Taalcactcg agactaattt tagtgttgct. aaaccacgtg agcctttggc tattattggt 440 
: a.tttggtag tgccagtaat tcagcgcagt tcaaaacctt attaaataat 

aat!aaaata ccttccgtga attaccagaa caacgc.gga aaggcatgga aagtaacgct 560 
aacgtcatgc agtcgttaca attacgcaaa gcgcctaaag gcagttacgt tgaacagcta 1620
gatlttgatt tcttgcgttt taaagtaccg cctaatgaaa aagattgctt gatcccgcaa 0 
cagttaatga tgatgcaagt ggcagacaat gctgcgaaag acggaggtct ^^ttgaaggt 740 
cgtaatgttg cggtattagt agcgatgggc atggaactgg aattacatca gtatcgtggt 1800
cgcgttaatc taaccaccca aatcgaagac agcttattac agcaaggtat taacctgact 1860
gttgagcaac g.gaagaact gaccaatatt gctaaagacg gtgttgcctc ^gctgcacag 920 
ctalatcag. atacgagt.t cattggtaat attatggcgt cacgtatttc ggcgttatgg 980 
gatttttctg gtcctgctat taccgtatcg gc.gaagaaa actctgttta tcgttgtgtt 040 
gaattagctg aaaatctatt tcaaaccagt gatgttgaag ccgttattat -^o^^ctgtt 00 
gatttgLtg gttcaattga aaacattact ttacgtcagc actacggtcc -^"aatgaa 0 
!agggatctg taagtgaatg tggtccggtt aatgaaagca gttcagtaac caacaatatt 2220 
ctfgatcagc aacaatggct ggtgggtgaa ggcgcagcgg ctattgtcgt taaaccgtca 2280 
tcgcaagtca ctgccgagca agtttatgcg cgtattgatg cggtgagttt tgcccctggt 2340
agcaatgcga aagcaattac gattgcagcg gataaagcat taacacttgc tg^tatcagt 2400 
gctgctgatg tagctagtgt tgaagcacat gcaagtggtt ttagtgccga aaataatgct 2460
gaaaaaaccg cgttaccgac tttataccca agcgcaagta tcagttcggt ^-agccaat 25 0 
attggtcata cgtttaatgc ctcgggtatg gcgagtatta ttaaaacggc gctgctgtta 2580 
gatcagaata cgagtcaaga tcagaaaagc aaaca.attg ctattaacgg --^^-^^ ° 
gataacagct gcgcgcatct tatcttatcg agttcagcgc aagcgcatca agttgcacca 2700 
gcgcctgtat ctggtatggc caagcaacgc ccacagttag ttaaaaccat ° 
ggtcagttaa ttagcaacgc gat.gttaac agtgcgagtt catctttaca cgc attaaa 8 0 
gcgcagtctg ccggtaagca cttaaacaaa gttaaccagc cagtgatgat ^^-"-"J ^8 0 
aagccccaag gtattagcgc tcatgcaacc aatgagtatg tggtgactgg agctgctaac 2940 
actcaagctt ctaacattca agcatctcat gttcaagcgt caagtcatgc -aagagata 3000 
gcaccaaacc aagttcaaaa tatgcaagct acagcagccg ctgtaagttc acccctttct 3060 
caacatcaac acacagcgca gcccgtagcg gcaccgagcg ttgttggagt gactgtgaaa 3120 
cataaagcaa gtaaccaaat tcatcagcaa gcgtctacgc ataaagcatt tttagaaagt 3180
cgtttagctg cacagaaaaa cctatcgcaa cttgttgaat tgcaaaccaa gctgtcaat 
caaactggta gtgacaatac atctaacaat actgcgtcaa caagcaatac agtgctaaca 3300 
aatcctg at cagcaacgcc attaacactt gtgtctaatg cgcctgtagt agcgacaaac 3 
ctaaccagta cagaagcaaa agcgcaagca gctgctacac aagctggttt tcagataaaa 3420
ggacctgttg gttacaacta tccaccgctg cagttaattg aacgttataa taaaccagaa 3480
aacgtgattt acgatcaagc tgatttggtt gaattcgctg aaggtgatat tggtaaggta 3540 
tttggtgctg aatacaatat tattgatggc tattcgcgtc gtgtacgtct gccaacctca 3600
gattacttgt tagtaacacg tgttactgaa cttgatgcca aggtgcatga atacaagaaa 3660 
tcatacatgt gtactgaata tgatgtgcct gttgatgcac cgttcttaat tgatggtcag 3720 
atcccttggt ctgttgccgt cgaatcaggc cagtgtgatt tgatgttgat ttcatatatc 3780 
ggtattgatt tccaagcgaa aggcgaacgt gtttaccgtt tacttgattg tgaattaact 3840 
ttccttgaag agatggcttt tggtggcgat actttacgtt acgagatcca cattgattcg 3900 
tatgcacgta acggcgagca attattattc ttcttccatt acgattgtta cgtaggggat 3960 
aagaaggtac ttatcatgcg taatggttgt gctggtttct ttactgacga agaactttct 4020 
gatggtaaag gcgttattca taacgacaaa gacaaagctg agtttagcaa tgctgttaaa 4080 
tcatcattca cgccgttatt acaacataac cgtggtcaat acgattataa cgacatgatg 4140 
aagttggtta atggtgatgt tgccagttgt tttggtccgc aatatgatca aggtggccgt 4200 
aatccatcat tgaaattctc gtctgagaag ttcttgatga ttgaacgtat taccaagata 4260 
gacccaaccg gtggtcattg gggactaggc ctgttagaag gtcagaaaga tttagaccct 4320 
gagcattggt atttcccttg tcactttaaa ggtgatcaag taatggctgg ttcgttgatg 4380 
tcggaaggtt gtggccaaat ggcgatgttc ttcatgctgt ctcttggtat gcataccaat 4440 
gtgaacaacg ctcgtttcca accactacca ggtgaatcac aaacggtacg ttgtcgtggg 4500 
caagtactgc cacagcgcaa taccttaact taccgtatgg aagttactgc gatgggtatg 4560 
catccacagc cattcatgaa agctaatatt gatattttgc ttgacggtaa agtggttgtt 4620 
gatttcaaaa acttgagcgt gatgatcagc gaacaagatg agcattcaga ttaccctgta 4680 
acactgccga gtaatgtggc gcttaaagcg attactgcac ctgttgcgtc agtagcacca 4740 
gcatcttcac ccgctaacag cgcggatcta gacgaacgtg gtgttgaacc gtttaagttt 4800 
ccfgaacgtc cgttaatgcg tgttgagtca gacttgtctg caccgaaaag caaaggtgtg 4860 
acaccgatta agcattttga agcgcctgct gttgctggtc atcatagagt gcctaaccaa 4920 
gcaccgttta caccttggca tatgtttgag tttgcgacgg gtaatatttc taactgtttc 4980 
ggtcctgatt ttgatgttta tgaaggtcgt attccacctc gtacaccttg tggcgattta 5040 
caagttgtta ctcaggttgt agaagtgcag ggcgaacgtc ttgatcttaa aaatccatca 5100 
agctgtgtag ctgaatacta tgtaccggaa gacgcttggt actttactaa aaacagccat 5160 
gaaaactgga tgccttattc attaatcatg gaaattgcat tgcaaccaaa tggctttatt 5220 
tctggttaca tgggcacgac gcttaaatac cctgaaaaag atctgttctt ccgtaacctt 5280 
gatggtagcg gcacgttatt aaagcagatt gatttacgcg gcaagaccat tgtgaataaa 5340 
tcagtcttgg ttagtacggc tattgctggt ggcgcgatta ttcaaagttt cacgtttgat 5400 
atgtctgtag atggcgagct attttatact ggtaaagctg tatttggtta ctttagtggt 5460 
gaatcactga ctaaccaact gggcattgat aacggtaaaa cgactaatgc gtggtttgtt 5520 
gataacaata cccccgcagc gaatattgat gtgtttgatt taactaatca gtcattggct 5580 
ctgtataaag cgcctgtgga taaaccgcat tataaattgg ctggtggtca gatgaacttt 5640
atcgatacag tgtcagtggt tgaaggcggt ggtaaagcgg gcgtggctta tgtttatggc 5700 
gaacgtacga ttgatgctga tgattggttc ttccgttatc acttccacca agatccggtg 5760 
atgccaggtt cattaggtgt tgaagctatt attgagttga tgcagaccta tgcgcttaaa 5820 
aatgatttgg gtggcaagtt tgctaaccca cgtttcattg cgccgatgac gcaagttgat 5880 
tggaaatacc gtgggcaaat tacgccgctg aataaacaga tgtcactgga cgtgcatatc 5940 
actgagatcg tgaatgacgc tggtgaagtg cgaatcgttg gtgatgcgaa tctgtctaaa 6000 
gatggtctgc gtatttatga agttaaaaac atcgttttaa gtattgttga agcgtaa 6057 

<210> 80 
<211> 1665 
<212> DNA 

<213> Vibrio marinus 
<400> 80

atgaatatag taagtaatca ttcggcagct acaaaaaagg aattaagaat gtcgagttta 60 
ggttttaaca ataacaacgc aattaactgg gcttggaaag tagatccagc gtcagttcat 120 
acacaagatg cagaaattaa agcagcttta atggatctaa ctaaacctct ctatgtggcg 180 
aataattcag gcgtaactgg tatagctaat catacgtcag tagcaggtgc gatcagcaat 240 
aacatcgatg ttgatgtatt ggcgtttgcg caaaagttaa acccagaaga tctgggtgat 300 
gatgcttaca agaaacagca cggcgttaaa tatgcttatc atggcggtgc gatggcaaat 360 
ggtattgcct cggttgaatt ggttgttgcg ttaggtaaag cagggctgtt atgttcattt 420 
ggtgctgcag gtctagtgcc tgatgcggtt gaagatgcaa ttcgtcgtat tcaagctgaa 480 
ttaccaaatg gcccttatgc ggttaacttg atccatgcac cagcagaaga agcattagag 540 
cgtggcgcgg ttgaacgttt cctaaaactt ggcgtcaaga cggtagaggc ttcagcttac 600 
cttggtttaa ctgaacacat tgtttggtat cgtgctgctg gtctaactaa aaacgcagat 660 
ggcagtgtta atatcggtaa caaggttatc gctaaagtat cgcgtaccga agttggtcgc 720 
cgctttatgg aacctgcacc gcaaaaatta ctggataagt tattagaaca aaataagatc 780 
acccctgaac aagctgcttt agcgttgctt gtacctatgg ctgatgatat tactggggaa 840 
gcggattctg gtggtcatac agataaccgt ccgtttttaa cattattacc gacgattatt 900 
ggtctgcgtg atgaagtgca agcgaagtat aacttctctc ctgcattacg tgttggtgct 960 
ggtggtggta tcggaacgcc tgaagcagca ctcgctgcat ttaacatggg cgcggcttat 1020 
atcgttctgg gttctgtgaa tcaggcgtgt gttgaagcgg gtgcatctga atatactcgt 1080 
aaactgttat cgacagttga aatggctgat gtgactatgg cacctgctgc agatatgttt 1140 
gaaatgggtg tgaagctgca agtattaaaa cgcggttcta tgttcgcgat gcgtgcgaag 1200 
aaattgtatg acttgtatgt ggcttatgac tcgattgaag atatcccagc tgctgaacgt 1260 
gagaagattg aaaaacaaat cttccgtgca aacctagacg agatttggga tggcactatc 1320 
gctttcttta ctgaacgcga tccagaaatg ctagcccgtg caacgagtag tcctaaacgt 1380 
aaaatggcac ttatcttccg ttggtatctt ggcctttctt cacgctggtc aaacacaggc 1440 
gagaagggac gtgaaatgga ttatcagatt tgggcaggcc caagtttagg tgcattcaac 1500 
agctgggtga aaggttctta ccttgaagac tatacccgcc gtggcgctgt agatgttgct 1560 
ttgcatatgc ttaaaggtgc tgcgtattta caacgtgtaa accagttgaa attgcaaggt 1620 
gttagcttaa gtacagaatt ggcaagttat cgtacgagtg attaa 1665 

<210> 81 
<211> 2910 
<212> DNA 

<213> Shewanella putrefaciens 
<400> 81 

atgagtatgt ttttaaattc aaaactttcg cgctcagtca aacttgccat atccgcaggc 60 
ttaacagcct cgctagctat gcctgttttt gcagaagaaa ctgctgctga agaacaaata 120 
gaaagagtcg cagtgaccgg atcgcgaatc gctaaagcag agctaactca accagctcca 180 
gtcgtcagcc tttcagccga agaactgaca aaatttggta atcaagattt aggtagcgta 240 
ctagcagaat tacctgctat tggtgcaacc aacactatta ttggtaataa caatagcaac 300 
tcaagcgcag gtgttagctc agcagacttg cgtcgtctag gtgctaacag aaccttagta 360 
ttagtcaacg gtaagcgcta cgttgccggc caaccgggct cagctgaggt agatttgtca 420 
actataccaa ctagcatgat ctcgcgagtt gagattgtaa ccggcggtgc ttcagcaatt 480 
tatggttcgg acgctgtatc aggtgttatc aacgttatcc ttaaagaaga ctttgaaggc 540 
tttgagttta acgcacgtac tagcggttct actgaaagtg taggcactca agagcactct 600
tttgacattt tgggtggtgc aaacgttgca gatggacgtg gtaatgtaac cttctacgca 660 
ggttatgaac gtacaaaaga agtcatggct accgacattc gccaattcga tgcttgggga 720 
acaattaaaa acgaagccga tggtggtgaa gatgatggta ttccagacag actacgtgta 780
ccacgagttt attctgaaat gattaatgct accggtgtta tcaatgcatt tggtggtgga 840 
attggtcgct caacctttga cagtaacggc aatcctattg cacaacaaga acgtgatggg 900 
actaacagct ttgcatttgg ttcattccct aatggctgtg acacatgttt caacactgaa 960 
gcatacgaaa actatattcc aggggtagaa agaataaacg ttggctcatc attcaacttt 1020 
gattttaccg ataacattca attttacact gacttcagat atgtaaagtc agatattcag 1080 
caacaatttc agccttcatt ccgttttggt aacattaata tcaatgttga agataacgcc 1140 
tttttgaatg acgacttgcg tcagcaaatg ctcgatgcgg gtcaaaccaa tgctagtttt 1200 
gccaagtttt ttgatgaatt aggaaatcgc tcagcagaaa ataaacgcga acttttccgt 1260 
tacgtaggtg gctttaaagg tggctttgat attagcgaaa ccatatttga ttacgacctt 1320 
tactatgttt atggcgagac taataaccgt cgtaaaaccc ttaatgacct aattcctgat 1380 
aactttgtcg cagctgtcga ctctgttatt gatcctgata ctggcttagc agcgtgtcgc 1440 
tcacaagtag caagcgctca aggcgatgac tatacagatc ccgcgtctgt aaatggtagc 1500
gactgtgttg cttataaccc atttggcatg ggtcaagctt cagcagaagc ccgcgactgg 1560 
gtttctgctg atgtgactcg tgaagacaaa ataactcaac aagtgattgg tggtactctc 1620 
ggtaccgatt ctgaagaact atttgagctt caaggtggtg caatcgctat ggttgttggt 1680 
tttgaatacc gtgaagaaac gtctggttca acaaccgatg aatttactaa agcaggtttc 1740 
ttgacaagcg ctgcaacgcc agattcttat ggcgaatacg acgtgactga gtattttgtt 1800 
gaggtgaaca tcccagtact aaaagaatta ccttttgcac atgagttgag ctttgacggt 1860 
gcataccgta atgctgatta ctcacatgcc ggtaagactg aagcatggaa agctggtatg 1920 
ttctactcac cattagagca acttgcatta cgtggtacgg taggtgaagc agtacgagca 1980 
ccaaacattg cagaagcctt tagtccacgc tctcctggtt ttggccgcgt ttcagatcca 2040
tgtgatgcag ataacattaa tgacgatccg gatcgcgtgt caaactgtgc agcattgggg 2100 
atccctccag gattccaagc taatgataac gtcagtgtag ataccttatc tggtggtaac 2160 
ccagatctaa aacctgaaac atcaacatcc tttacaggtg gtcttgtttg gacaccaacg 2220 
tttgctgaca atctatcatt cactgtcgat tattatgata ttcaaattga ggatgctatt 2280 
ttgtcagtag ccacccagac tgtggctgat aactgtgttg actcaactgg cggacctgac 2340 
accgacttct gtagtcaagt tgatcgtaat ccaacgacct atgatattga acttgttcgc 2400 
tctggttatc taaatgccgc ggcattgaat accaaaggta ttgaatttca agctgcatac 2460 
tcattagatc tagagtcttt caacgcgcct ggtgaactac gcttcaacct attggggaac 2520 
caattacttg aactagaacg tcttgaattc caaaatcgtc ctgatgagat taatgatgaa 2580 
aaaggcgaag taggtgatcc agagctgcag ttccgcctag gcatcgatta ccgtctagat 2640 
gatctaagtg ttagctggaa cacgcgttat attgatagcg tagtaactta tgatgtctct 2700 
gaaaatggtg gctctcctga agatttatat ccaggccaca taggctcaat gacaactcat 2760 
gacttgagcg ctacatacta catcaatgag aacttcatga ttaacggtgg tgtacgtaac 2820
ctatttgacg cacttccacc tggatacact aacgatgcgc tatatgatct agttggtcgc 2880
cgtgcattcc taggtattaa ggtaatgatg 2910



<210> 82 
<211> 864 
<212> DNA 

<213> Shewanella putrefaciens 



<400> 82 

atggcaaaaa taaatagtga acacttggat gaagctacta ttacttcgaa taagtgtacg 60 
caaacagaga ctgaggctcg gcatagaaat gccactacaa cacctgagat gcgccgattc 120 
atacaagagt cggatctcag tgttagccaa ctgtctaaaa tattaaatat cagtgaagct 180 
accgtacata agcggcgcaa gcgtgactct gtcgaaaact gtcctaatac cccgcaccat 240 
..raacccc tttqcaagaa tatgtggttg tgggcctgcg ttatcaattg 300
ctcaatacca ^9--- J ,,,,,,,,,, 3aacgtgtcg 360 

rarraj rg::::::: z 
:::::::::! rg:::::::; r/aiaJ .gcc... .ag.accga. 

caaacctaca J ggtgtctctc accattccac caaagttaac cgaagaagca 600 

llZZll ™ cl/tgatcct ca.agcgact ggatc.atc. cgaca.a.ac 660 
TZ. atacacaagc cacgaataga tatatggctt atgtgctaaa acacgggcca 720 
::::: : ; a gftaL cg/gcg.aac tatcacacct ttttacagcg ctttcctgga 780 
Tcgacgcaaa a.cgccgccc c.c.aaaga. atgcctgaaa caatcaacaa gacgcc.gaa 8.0 
acacaggcac ccagtggaga ctca 864



<210> 83 
<211> 8268 
<212> DNA 

<213> Shewanella putrefaciens 
<400> 83 



: gccaga cc.ctaaacc tacaaactca gcaactgagc aagcacaaga c. acaagc 
aact!tcg« taaataaacg actaaaagat atgccaattg ctattgttgg catggcgagt 20 
rtttttafaa actctcgcta tttgaataag ttttgggact taatcagcga aaaaattgat 10 

H::: ::= = =: 
riis =: =" r=: ::= =: 
:= =: := =: i 

^^.^tatcaa oaaattccaa gaccaatatg tacactggga agaaaactcg 600 
a - rgtta^gcg :gccg.a.cg ccaaccgct. cga.tt.ggc 660 
ggcltgaact gtgtggttga tgctgcctgt gctggatcac ttgctgctat gcgtatggcg 20 
: aaclgagc Lac.gaagg .cgctctgaa atgatgatca ccgg.ggtgt gtg.act a 
aactcaccct ctatgtatat gagcttttca aaaacgcccg cctttaccac taacgaaacc 840 
attcagccat ttgatatcga ctcaaaaggc atgatgattg gtgaaggtat tggcatggtg 900 
tZllZ g/cttgaaga tgcagagcgc gatggcgacc gcat.tactc tgtaatt 
ggtgtgggtg catcatctga cggtaagttt aaatcaatct atgcccctcg -"^"J^ - ^ °^ 
caagctllag cacttaaccg tgcctatgat gacgcaggtt ttgcgccgca taccttaggt 0 0 
ctaattgaag ctcacggaac aggtactgca gcaggtgacg cggcagagtt tgccggcctt 1140
tgctcagcat ttgctgaagg caacga.acc aagcaacaca ttgcgctagg ttcag. aaa 20 
tcacaaattg gtcatactaa a.caactgca ggtacagcag gtttaattaa agctgctctt ^ 
gctttgcatc acaaggtact gccgccgacc attaacgtta gtcagccaag ^ ° 

gataccgaaa ac.caccgtt ttatctaaac actgagactc gtccatggtt acca gtg^^^^ 
gatggtacgc cgcgccgcgc gggtattagc tcatttggtt ttggtggcac taacttccat 1440

t Ltag aagagtacaa ccaagaacac agccgtactg atagcgaaaa agctaag a 
cgtcaacgcc aagtggcgca aagcttcctt gttagcgcaa gcgataaagc atcgctaatt 1560 
aacgagt aa acgtac.agc agcatctgca agccaagctg agtttatcct caaaga gca 
gcagcaaact atggcgtacg tgagcttgat aaaaatgcac cacggatcgg tttagt gca 10 
Talagcg aagag.agc aggcc.aa. aagcaagc 

gatgataacg catggcagct acctggtggc actagctacc gcgccgc g y v, 
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taqccaaggt tcacaatatc tcaatatggg ccgtgacctt 1860 
rrgrgn ""cagraa ..cg.aac.g ^aga^aaag. ac«gccgca a.O 
gitaaaa cgccgttatc gcaaactctg tatccaaagc ctgtatttaa taaaga ga 
ttaaaggctc aagaagccat tttgaccaat accgccaatg cccaaagcgc aattggtgcg 2040
atttcaatgg gtcaatacga tttgtttact gcggctggct ttaatgccga catggt gca 2 00 
gccatagct ttggtgagct aagtgcactg tgtgctgcag gtgttatttc agctgatgac 
tactacaagc tggcttttgc tcgtggtgag gctatggcaa caaaagcacc ggctaaagac 2220 
ggcgtcgaag cagatgcagg agcaatgttt gcaatcataa ccaagagtgc tgcagacctt 
!!a!ccgttg aagccaccat cgctaaattt gatggggtga aagtcgctaa ctataacgcg 2340 

acgcaa^^ cagtaattgc aggcccaaca gcaactaccg ctgatgcggc taaagcgcta 
!!tgagcttg gttacaaagc gattaacctg ccagtatcag gtgcattcca cactgaactt 2460 

g cac !tcaagcgcc atttgctaaa gcgattgacg cagccaaatt tactaaaaca 
agccgagcac tttactcaaa tgcaactggc ggactttatg aaagcactgc tgcaaagatt 2580
aaagccLgt ttaagaaaca tatgcttcaa tcagtgcgct ttactagcca gctagaagcc 2 4 
atgtacaacg acggcgcccg tgtatttgtt gaatttggtc caaagaacat cttacaaaaa 2700 

tagttcaag gcacgctt^t caacactgaa aatgaagttt gcactatctc tatcaaccct 2760 
aatcctaaag t.gatagtga tctgcagctt aagcaagcag caatgcagct agcggttact 2820 
ggtgtggtac tcag.gaaat tgacccatac caagccgata ttgccgcacc ^^^'^^^^^^ 2880 
!Lccaatga gcatttcgct taatgctgct aaccatatca gcaaagcaac tcgcgctaag 2940 
r/ggccaagt ctttagagac agg.atcgtc acc.cgcaaa tagaacatgt ta.t:gaagaa 3000 
aaaatcgttg aagt.gagaa actggt.gaa gtcgaaaaga tcgtcgaaaa agtggttga^^ 3060 
gtagagaaag ttgttgaggt tgaagctcct gttaat.cag tgcaagccaa tgcaattcaa 3120 
Icccgttcag ttg.cgctcc agtaatagag aaccaagtcg tgtctaaaaa cag.aagcca 3180 
gcagLcaga gcattagtgg tgatgcactc agcaactttt ttgctgcaca ^"9"-- 3240 
gcacagttgc a.cagcagtt cttagctatt ccgcagcaat atggtgagac ^ttcactacg 3300 
!tgatgaccg agcaagctaa actggcaagt tctggtgttg caattccaga gagtctgcaa 3360 
cgctcaatgg agcaattcca ccaactacaa gcgcaaacac tacaaagcca cacccagttc 3420 
cttgagatgc aagcgggtag caacattgca gcgttaaacc tactcaatag cagccaagca 3480 
actL!gctc cagccattca caatgaagcg attcaaagcc aagtggttca aagccaaact 3540 
gcagtclagc cagtaatttc aacacaagtt aaccatgtgt cagagcagcc aactcaagct 3600 
ccagctccaa aagcgcagcc agcacctgtg acaactgcag ttcaaactgc tccggcacaa 3660 
gttgttcgtc aagccgcacc agttcaagcc gctattgaac cgattaatac aagtgttgcg 3720 
!ct!caacgc cttcagcctt cagcgccgaa acagccctga gcgcaacaaa agtccaagcc 3780 
actatgcttg aagtggttgc tgagaaaacc ggttacccaa ctgaaatgct agagcttgaa 3840 
atggaLtgg aagccgattt aggcatcgat tctatcaagc gtgtagaaat -"^^"J- ° 
gtacaagatg agctaccggg tctacctgag cttagccctg aagatctagc tgagtgtcga 3960 
acgctaggcg aaatcgttga ctatatgggc agtaaactgc cggctgaagg ctctatgaat 4020 
tctcagctgt ctacaggttc cgcagctgcg actcctgcag cgaatggtct ttctgcggag 4080 
aaagttcaag cgactatgat gtctgtggtt gccgaaaaga ctggctaccc aactgaaatg 4140 
ctagagcttg aaatggatat ggaagccgat ttaggcatag attctatcaa gcgcgt gaa 4200 
attcttggca cagtacaaga tgagctaccg ggtctacctg agcttagccc tgaagatcta 4260 
gctgagtgtc gtactctagg cgaaatcgtt gactatatga actctaaact cgctgacggc 4320 
tctaagctgc cggctgaagg ctctatgaat tctcagctgt ctacaagtgc cgcagctgcg 4380 
actcctgcag cgaatggtct ctctgcggag aaagttcaag cgactatgat gtctgtggtt 4440 
gccgaaaaga ctggctaccc aactgaaatg ctagaacttg aaatggatat ggaagctgac 4500 
cttggcatcg attcaatcaa gcgcgttgaa attcttggca cagtacaaga tgagctaccg 4560 
ggtttacctg agctaaatcc agaagatttg gcagagtgtc gtactcttgg cgaaatcgtg 4620 
Icttatatga actctaaact cgctgacggc tctaagctgc cagctgaagg ctctatgcac 4680 
^3^.^-^gj;tgt ctacaagtac cgctgctgcg actcctgtag cgaatggtct ctctgcagaa 4740 
aaagttcaag cgaccatgat gtctgtagtt gcagataaaa ctggctaccc aactgaaatg 4800 
cttgaacttg aaatggatat ggaagccgat ttaggtatcg attctatcaa gcgcgttgaa 4860 
attcttggca cagtacaaga tgagctaccg ggtttacctg agctaaatcc agaagatcta 4920 
gcagagtgtc gcaccctagg cgaaatcgtt gactatatgg gcagtaaact gccggctgaa 4980 
ggctctgcta atacaagtgc cgctgcgtct cttaatgtta gtgccgttgc ggcgcctcaa 5040 
gctgctgcga ctcctgtatc gaacggtctc tctgcagaga aagtgcaaag cactatgatg 5100 
tcagtagttg cagaaaagac cggctaccca actgaaatgc tagaacttgg catggatatg 5160 
gaagccgatt taggtatcga ctcaattaaa cgcgttgaga ttcttggcac agtacaagat 5220 
gagctaccgg gtctaccaga gcttaatcct gaagatttag ctgagtgccg tacgctgggc 5280 
gaaatcgttg actatatgaa ctctaagctg gctgacggct ctaagcttcc agctgaaggc 5340 
tctgctaata caagtgccac tgctgcgact cctgcagtga atggtctttc tgctgacaag 5400 
gtacaggcga ctatgatgtc tgtagttgct gaaaagaccg gctacccaac tgaaatgcta 5460 
gaacttggca tggatatgga agcagacctt ggtattgatt ctattaagcg cgttgaaatt 5520 
cttggcacag tacaagatga gctcccaggt ttacctgagc ttaatcctga agatctcgct 5580 
gagtgccgca cgcttggcga aatcgttagc tatatgaact ctcaactggc tgatggctct 5640 
aaactttcta caagtgcggc tgaaggctct gctgatacaa gtgctgcaaa tgctgcaaag 5700 
ccggcagcaa tttcggcaga accaagtgtt gagcttcctc ctcatagcga ggtagcgcta 5760 
aaaaagctta atgcggcgaa caagctagaa aattgtttcg ccgcagacgc aagtgttgtg 5820 
attaacgatg atggtcacaa cgcaggcgtt ttagctgaga aacttattaa acaaggccta 5880 
aaagtagccg ttgtgcgttt accgaaaggt cagcctcaat cgccactttc aagcgatgtt 5940 
gctagctttg agcttgcctc aagccaagaa tctgagcttg aagccagtat cactgcagtt 6000 
atcgcgcaga ttgaaactca ggttggcgct attggtggct ttattcactt gcaaccagaa 6060 
gcgaatacag aagagcaaac ggcagtaaac ctagatgcgc aaagttttac tcacgttagc 6120 
aatgcgttct tgtgggccaa attattgcaa ccaaagctcg ttgctggagc agatgcgcgt 6180 
cgctgttttg taacagtaag ccgtatcgac ggtggctttg gttacctaaa tactgacgcc 6240 
ctaaaagatg ctgagctaaa ccaagcagca ttagctggtt taactaaaac cttaagccat 6300 
gaatggccac aagtgttctg tcgcgcgcta gatattgcaa cagatgttga tgcaacccat 6360 
cttgctgatg caatcaccag tgaactattt gatagccaag ctcagctacc tgaagtgggc 6420 
ttaagcttaa ttgatggcaa agttaaccgc gtaactctag ttgctgctga agctgcagat 6480 
aaaacagcaa aagcagagct taacagcaca gataaaatct tagtgactgg tggggcaaaa 6540 
ggggtgacat ttgaatgtgc actggcatta gcatctcgca gccagtctca ctttatctta 6600 
gctgggcgca gtgaattaca agctttacca agctgggctg agggtaagca aactagcgag 6660 
ctaaaatcag ctgcaatcgc acatattatt tctactggrc aaaagccaac gcctaagcaa 6720 
gttgaagccg ctgtgtggcc agtgcaaagc agcattgaaa ttaatgccgc cctagccgcc 6780 
tttaacaaag ttggcgcctc agctgaatac gtcagcatgg atgttaccga tagcgccgca 6840 
atcacagcag cacttaatgg tcgctcaaat gagatcaccg gtcttattca tggcgcaggt 6900 
gtactagccg acaagcatat tcaagacaag actcttgctg aacttgctaa agtttatggc 6960 
actaaagtca acggcctaaa agcgctgctc gcggcacttg agccaagcaa aattaaatta 7020 
cttgctatgt tctcatctgc agcaggtttt tacggtaata tcggccaaag cgattacgcg 7080 
atgtcgaacg atattcttaa caaggcagcg ctgcagttca ccgctcgcaa cccacaagct 7140 
aaagtcatga gctttaactg gggtccctgg gatggcggca tggttaaccc agcgcttaaa 7200 
aagatgttta ccgagcgtgg tgtgtacgtt attccactaa aagcaggtgc agagctattt 7260 
gccactcagc tattggctga aactggcgtg cagttgctca ttggtacgtc aatgcaaggt 7320 
ggcagcgaca ctaaagcaac tgagactgct tctgtaaaaa agcttaatgc gggtgaggtg 7380 
ctaagtgcat cgcatccgcg tgctggtgca caaaaaacac cactacaagc tgtcactgca 7440 
acgcgtctgt taaccccaag tgccatggtc ttcattgaag atcaccgcat tggcggtaac 7500 
agtgtgttgc caacggtatg cgccatcgac tggatgcgtg aagcggcaag cgacatgctt 7560 
ggcgctcaag ttaaggtact tgattacaag ctattaaaag gcattgtatt tgagactgat 7620 
gagccgcaag agttaacact tgagctaacg ccagacgatt cagacgaagc tacgctacaa 7680 
gcattaatca gctgtaatgg gcgtccgcaa tacaaggcga cgcttatcag tgataatgcc 7740 
gatattaagc aacttaacaa gcagtttgat ttaagcgcta aggcgattac cacagcaaaa 7800 
gagctttata gcaacggcac cttgttccac ggtccgcgtc tacaagggat ccaatctgta 7860 
gtgcagttcg atgatcaagg cttaattgct aaagtcgctc tgcctaaggt tgaacttagc 7920 
gattgtggtg agttcttgcc gcaaacccac atgggtggca gtcaaccttt tgctgaggac 7980 
ttgctattac aagctatgct ggtttgggct cgccttaaaa ctggctcggc aagtttgcca 8040 
tcaagcattg gtgagtttac ctcataccaa ccaatggcct ttggtgaaac tggtaccata 8100 
gagcttgaag tgattaagca caacaaacgc tcacttgaag cgaatgttgc gctatatcgt 8160 
gacaacggcg agttaagtgc catgtttaag tcagctaaaa tcaccattag caaaagctta 8220 
aattcagcat ttttacctgc tgtcttagca aacgacagtg aggcgaat 8268 

<210> 84 
<211> 2313 
<212> DNA 

<213> Shewanella putrefaciens 
<400> 84 

atgccgctgc gcatcgcact tatcttactg ccaacaccgc agtttgaagt taactctgtc 60 
gaccagtcag tattagccag ctatcaaaca ctgcagcctg agctaaatgc cctgcttaat 120 
agtgcgccga cacctgaaat gctcagcatc actatctcag atgatagcga tgcaaacagc 180 
tttgagtcgc agctaaatgc tgcgaccaac gcaattaaca atggctatat cgtcaagctt 240 
gctacggcaa ctcacgcttt gttaatgctg cctgcattaa aagcggcgca aatgcggatc 300 
catcctcatg cgcagcttgc cgctatgcag caagctaaat cgacgccaat gagtcaagta 360 
tctggtgagc taaagcttgg cgctaatgcg ctaagcctag ctcagactaa tgcgctgtct 420 
catgctttaa gccaagccaa gcgtaactta actgatgtca gcgtgaatga gtgttttgag 480 
aacctcaaaa gtgaacagca gttcacagag gtttattcgc ttattcagca acttgctagc 540 
cgcacccatg tgagaaaaga ggttaatcaa ggtgtggaac ttggccctaa acaagccaaa 600 
agccactatt ggtttagcga atttcaccaa aaccgtgttg ctgccatcaa ctttattaat 660 
ggccaacaag caaccagcta tgtgcttact caaggttcag gattgtbagc tgcgaaatca 720 
atgctaaacc agcaaagatt aatgtttatc ttgccgggta acagtcagca acaaataacc 780 
gcatcaataa ctcagttaat gcagcaatta gagcgtttgc aggtaactga ggttaatgag 840 
ctttctctag aatgccaact agagctgctc agcataatgt atgacaactt agtcaacgca 900 
gacaaactca ctactcgcga tagtaagccc gcttatcagg ctgtgattca agcaagctct 960 
gttagcgctg caaagcaaga gttaagcgcg cttaacgatg cactcacagc gctgtttgct 1020 
gagcaaacaa acgccacatc aacgaataaa ggcttaatcc aatacaaaac accggcgggc 1080 
agttacttaa ccctaacacc gcttggcagc aacaatgaca acgcccaagc gggtcttgct 1140 
tttgtctatc cgggtgtggg aacggtttac gccgatatgc ttaatgagct gcatcagtac 1200 
ttccctgcgc tttacgccaa acttgagcgt gaaggcgatt taaaggcgat gctacaagca 1260 
gaagatatcc atcatcttga ccctaaacat gctgcccaaa tgagcttagg tgacttagcc 1320 
attgctggcg tggggagcag ctacctgtta actcagctgc tcaccgatga gtttaatatt 1380 
aagcctaatt ttgcattagg ttactcaatg ggtgaagcat caatgtgggc aagcttaggc 1440 
gtatggcaaa acccgcatgc gctgatcagc aaaacccaaa ccgacccgct atttacttct 1500 
gctatttccg gcaaattgac cgcggttaga caagcttggc agcttgatga taccgcagcg 1560 
gaaatccagt ggaatagctt tgtggttaga agtgaagcag cgccgattga agccttgcta 1620 
aaagattacc cacacgctta cctcgcgatt attcaagggg atacctgcgt aatcgctggc 1680 
tgtgaaatcc aatgtaaagc gctacttgca gcactgggta aacgcggtat tgcagctaat 1740 
cgtgtaacgg cgatgcatac gcagcctgcg atgcaagagc atcaaaatgt gatggatttt 1800 
tatctgcaac cgrtaaaagc agagcttcct agtgaaataa gctttatcag cgccgctgat 1860 
ttaactgcca agcaaacggt gagtgagcaa gcacttagca gccaagtcgt tgctcagtct 1920 
attgccgaca ccttctgcca aaccctggac tttaccgcgc tagtacatca cgcccaacat 1980 
caaggcgcta agctgtttgt tgaaattggc gcggatagac aaaactgcac cttgatagac 2040 
aagattgtta aacaagatgg tgccagcagt gtacaacatc aaccttgttg cacagtgcct 2100 
atgaacgcaa aaggtagcca agatattacc agcgtgatta aagcgcttgg ccaattaatt 2160 
agccatcagg tgccattatc ggtgcaacca tttattgatg gactcaagcg cgagctaaca 2220 
ctttgccaat tgaccagcca acagctggca gcacatgcaa atgttgacag caagtttgag 2280 
tctaaccaag accatttact tcaaggggaa gtc 2313 



<210> 85 
<211> 6012 
<212> DNA 

<213> Shewanella putrefaciens 



<400> 85 

atgtcattac cagacaatgc ttctaaccac ctttctgcca accagaaagg cgcatctcag 60 
gcaagtaaaa ccagtaagca aagcaaaatc gccattgtcg gtttagccac tctgtatcca 120 
gacgctaaaa ccccgcaaga attttggcag aatttgctgg ataaacgcga ctctcgcagc 180 
accttaacta acgaaaaact cggcgctaac agccaagatt atcaaggtgt gcaaggccaa 240 
tctgaccgtt tttattgtaa taaaggcggc tacattgaga acttcagctt taatgctgca 300 
ggctacaaat tgccggagca aagcttaaat ggcttggacg acagcttcct ttgggcgctc 360 
gatactagcc gtaacgcact aattgatgct ggtattgata tcaacggcgc tgatttaagc 420 
cgcgcaggtg tagtcatggg cgcgctgtcg ttcccaacta cccgctcaaa cgatctgttt 480 
ttgccaattt atcacagcgc cgttgaaaaa gccctgcaag ataaactagg cgtaaaggca 540 
tttaagctaa gcccaactaa tgctcatacc gctcgcgcgg caaatgagag cagcctaaat 600 
gcagccaatg gtgccattgc ccataacagc tcaaaagtgg tggccgatgc acttggcctt 660 
ggcggcgcac aactaagcct agatgctgcc tgtgctagtt cggtttactc attaaagctt 720 
gcctgcgatt acctaagcac tggcaaagcc gatatcatgc tagcaggcgc agtatctggc 780 
gcggatcctt tctttattaa tatgggattc tcaatcttcc acgcctaccc agaccatggt 840 
atctcagtac cgtttgatgc cagcagtaaa ggtttgtttg ctggcgaagg cgctggcgta 900 
ttagtgctta aacgtcttga agatgccgag cgcgacaatg acaaaatcta tgcggttgtt 960 
agcggcgtag gtctatcaaa cgacggtaaa ggccagtttg tattaagccc taatccaaaa 1020 
ggtcaggtga aggcctttga acgtgcttat gctgccagtg acattgagcc aaaagacatt 1080 
gaagtgattg agtgccacgc aacaggcaca ccgcttggcg ataaaattga gctcacttca 1140 
atggaaacct tctttgaaga caagctgcaa ggcaccgatg caccgttaat tggctcagct 1200 
aagtctaact taggccacct attaactgca gcgcatgcgg ggatcatgaa gatgatcttc 1260 
gccatgaaag aaggttacct gccgccaagt atcaatatta gtgatgctat cgcttcgccg 1320 
aaaaaactct tcggtaaacc aaccctgcct agcatggttc aaggctggcc agataagcca 1380 
tcgaataatc attttggtgt aagaacccgt cacgcaggcg tatcggtatt tggctttggt 1440 
ggctgtaacg cccatctgtt gcttgagtca tacaacggca aaggaacagt aaaggcagaa 1500 
gccactcaag taccgcgtca agctgagccg ctaaaagtgg ttggccttgc ctcgcacttt 1560 
gggcctctta gcagcattaa tgcactcaac aatgctgtga cccaagatgg gaatggcttt 1620 
atcgaactgc cgaaaaagcg ctggaaaggc cttgaaaagc acagtgaact gttagctgaa 1680 
tttggcttag catctgcgcc aaaaggtgct tatgttgata acttcgagct ggacttttta 1740 
cgctttaaac tgccgccaaa cgaagatgac cgtttgatct cacagcagct aatgctaatg 1800 
cgagtaacag acgaagccat tcgtgatgcc aagcttgagc cggggcaaaa agtagctgta 1860 
ttagtggcaa tggaaactga gcttgaactg catcagttcc gcggccgggt taacttgcat 1920 
actcaattag cgcaaagtct tgccgccatg ggcgtgagtt tatcaacgga tgaataccaa 1980 
gcgcttgaag ccatcgccat ggacagcgtg cttgatgctg ccaagctcaa tcagtacacc 2040 
agctttattg gtaatattat ggcgtcacgc gtggcgtcac tatgggactt taatggccca 2100 
gccttcacta tttcagcagc agagcaatct gtgagccgct gtatcgatgt ggcgcaaaac 2160 
ctcatcatgg aggataacct agatgcggtg gtgattgcag cggtcgatct ctctggtagc 2220 
tttgagcaag tcattcttaa aaatgccatt gcacctgtag ccattgagcc aaacctcgaa 2280 
gcaagcctta atccaacatc agcaagctgg aatgtcggtg aaggtgctgg cgcggtcgtg 2340 
cttgttaaaa atgaagctac atcgggctgc tcatacggcc aaattgatgc acttggcttt 2400 
gctaaaactg ccgaaacagc gttggctacc gacaagctac tgagccaaac tgccacagac 2460 
tttaataagg ttaaagtgat tgaaactatg gcagcgcctg ctagccaaat tcaattagcg 2520 
ccaatagtta gctctcaagt gactcacact gctgcagagc agcgtgttgg tcactgcttt 2580 
gctgcagcgg gtatggcaag cctattacac ggcttactta acttaaatac tgtagcccaa 2640 
accaataaag ccaattgcgc gcttatcaac aatatcagtg aaaaccaatt atcacagctg 2700 
ttgattagcc aaacagcgag cgaacaacaa gcattaaccg cgcgtttaag caatgagctt 2760 
aaatccgatg ctaaacacca actggttaag caagtcacct taggtggccg tgatatctac 2820 
cagcatattg ttgatacacc gcttgcaagc cttgaaagca ttactcagaa attggcgcaa 2880 
gcgacagcat cgacagtggt caaccaagtt aaacctatta aggccgctgg ctcagtcgaa 2940 
atggctaact cattcgaaac ggaaagctca gcagagccac aaataacaat tgcagcacaa 3000 
cagactgcaa acattggcgt caccgctcag gcaaccaaac gtgaattagg taccccacca 3060 
atgacaacaa ataccattgc taatacagca aataatttag acaagactct tgagactgtt 3120 
gctggcaata ctgttgctag caaggttggc tctggcgaca tagtcaattt tcaacagaac 3180 
caacaattgg ctcaacaagc tcacctcgcc tttcttgaaa gccgcagtgc gggtatgaag 3240 
gtggctgatg ctttattgaa gcaacagcta gctcaagtaa caggccaaac tatcgataat 3300 
caggccctcg atactcaagc cgtcgatact caaacaagcg agaatgtagc gattgccgca 3360 
gaatcaccag ttcaagttac aacacctgtt caagttacaa cacctgttca aatcagtgtt 3420 
gtggagttaa aaccagatca cgctaatgtg ccaccataca cgccgccagt gcctgcatta 3480 
aagccgtgta tctggaacta tgccgattta gttgagtacg cagaaggcga tatcgccaag 3540 
gtatttggca gtgattatgc cattatcgac agctactcgc gccgcgtacg tctaccgacc 3600 
actgactacc tgttggtatc gcgcgtgacc aaacttgatg cgaccatcaa tcaatttaag 3660 
ccatgctcaa tgaccactga gtacgacatc cctgttgatg cgccgtactt agtagacgga 3720 
caaatccctt gggcggtagc agtagaatca ggccaatgtg acttgatgct tattagctat 3780 
ctcggtatcg actttgagaa caaaggcgag cgggtttatc gactactcga ttgtaccctc 3840 
accttcctag gcgacttgcc acgtggcgga gataccctac gttacgacat taagatcaat 3900 
aactatgctc gcaacggcga caccctgctg ttcttcttct cgtatgagtg ttttgttggc 3960 
gacaagatga tcctcaagat ggatggcggc tgcgctggct tcttcactga tgaagagctt 4020 
gccgacggta aaggcgtgat tcgcacagaa gaagagatta aagctcgcag cctagtgcaa 4080 
aagcaacgct ttaatccgtt actagattgt cctaaaaccc aatttagtta tggtgatatt 4140 
cataagctat taactgctga tattgagggt tgttttggcc caagccacag tggcgtccac 4200 
cagccgtcac tttgtttcgc atctgaaaaa ttcttgatga ttgaacaagt cagcaaggtt 4260 
gatcgcactg gcggtacttg gggacttggc ttaattgagg gtcataagca gcttgaagca 4320 
gaccactggt acttcccatg tcatttcaag ggcgaccaag tgatggctgg ctcgctaatg 4380 
gctgaaggtt gtggccagtt attgcagttc tatatgctgc accttggtat gcatacccaa 4440 
actaaaaatg gtcgtttcca acctcttgaa aacgcctcac agcaagtacg ctgtcgcggt 4500 
caagtgctgc cacaatcagg cgtgctaact taccgtatgg aagtgactga aatcggtttc 4560 
agtccacgcc catatgctaa agctaacatc gatatcttgc ttaatggcaa agcggtagtg 4620 
gatttccaaa acctaggggt gatgataaaa gaggaagatg agtgtactcg ttatccactt 4680 
ttgactgaat caacaacggc tagcactgca caagtaaacg ctcaaacaag tgcgaaaaag 4740 
gtatacaagc cagcatcagt caatgcgcca ttaatggcac aaattcctga tctgactaaa 4800 
gagccaaaca agggcgttat tccgatttcc catgttgaag caccaattac gccagactac 4860 
ccgaaccgtg tacctgatac agtgccattc acgccgtatc acatgtttga gtttgctaca 4920 
ggcaatatcg aaaactgttt cgggccagag ttctcaatct atcgcggcat gatcccacca 4980 
cgtacaccat gcggtgactt acaagtgacc acacgtgtga ttgaagttaa cggtaagcgt 5040 
ggcgacttta aaaagccatc atcgtgtatc gctgaatatg aagtgcctgc agatgcgtgg 5100 
tatttcgata aaaacagcca cggcgcagtg atgccatatt caattttaat ggagatctca 5160 
ctgcaaccta acggctttat ctcaggttac atgggcacaa ccctaggctt ccctggcctt 5220 
gagctgttct tccgtaactt agacggtagc ggtgagttac tacgtgaagt agatttacgt 5280 
ggtaaaacca tccgtaacga ctcacgttta ttatcaacag tgatggccgg cactaacatc 5340 
atccaaagct ttagcttcga gctaagcact gacggtgagc ctttctatcg cggcactgcg 5400 
gtatttggct attttaaagg tgacgcactt aaagatcagc taggcctaga taacggtaaa 5460 
gtcactcagc catggcatgt agctaacggc gttgctgcaa gcactaaggt gaacctgctt 5520 
gataagagct gccgtcactt taatgcgcca gctaaccagc cacactatcg tctagccggt 5580 
ggtcagctga actttatcga cagtgttgaa attgttgata atggcggcac cgaaggttta 5640 
ggttacttgt atgccgagcg caccattgac ccaagtgatt ggttcttcca gttccacttc 5700 
caccaagatc cggttatgcc aggctcctta ggtgttgaag caattattga aaccatgcaa 5760 
gcttacgcta ttagtaaaga cttgggcgca gatttcaaaa atcctaagtt tggtcagatt 5820 
ttatcgaaca tcaagtggaa gtatcgcggt caaatcaatc cgctgaacaa gcagatgtct 5880 
atggatgtca gcattacttc aatcaaagat gaagacggta agaaagtcat cacaggtaat 5940 
gccagcttga gtaaagatgg tctgcgcata tacgaggtct tcgatatagc tatcagcatc 6000 
gaagaatctg ta 6012 



<210> 86 
<211> 1629 
<212> DNA 

<213> Shewanella putrefaciens 



<400> 86 

atgaatccta cagcaactaa cgaaatgctt tctccgtggc catgggctgt gacagagtca 60 
aatatcagtt ttgacgtgca agtgatggaa caacaactta aagattttag ccgggcatgt 120 
tacgtggtca atcatgccga ccacggcttt ggtattgcgc aaactgccga tatcgtgact 180 
gaacaagcgg caaacagcac agatttacct gttagtgctt ttactcctgc attaggtacc 240 
gaaagcctag gcgacaataa tttccgccgc gttcacggcg ttaaatacgc ttattacgca 300 
ggcgctatgg caaacggtat ttcatctgaa gagctagtga ttgccctagg tcaagctggc 360 
attttgtgtg gttcgtttgg agcagccggt cttattccaa gtcgcgttga agcggcaatt 420 
aaccgtattc aagcagcgct gccaaatggc ccttatatgt ttaaccttat ccatagtcct 480 
agcgagccag cattagagcg tggcagcgta gagctatttt taaagcataa ggtacgcacc 540 
gttgaagcat cagctttctt aggtctaaca ccacaaatcg tctattaccg tgcagcagga 600 
ttgagccgag acgcacaagg taaagttgtg gttggtaaca aggttatcgc taaagtaagt 660 
cgcaccgaag tggctgaaaa gtttatgatg ccagcgcccg caaaaatgct acaaaaacta 720 
gttgatgacg gttcaattac cgctgagcaa atggagctgg cgcaacttgt acctatggct 780 
gacgacatca ctgcagaggc cgattcaggt ggccatactg ataaccgtcc attagtaaca 840 
ttgctgccaa ccattttagc gctgaaagaa gaaattcaag ctaaatacca atacgacact 900 
cctattcgtg tcggttgtgg tggcggtgtg ggtacgcctg atgcagcgct ggcaacgttt 960 
aacatgggcg cggcgtatat tgttaccggc tctatcaacc aagcttgtgt tgaagcgggc 1020 
gcaagtgarc acactcgtaa attacttgcc accactgaaa tggccgatgt gactatggca 1080 
ccagctgcag atatgttcga gatgggcgta aaactgcagg tggttaagcg cggcacgcta 1140 
ttcccaatgc gcgctaacaa gctatatgag atctacaccc gttacgattc aatcgaagcg 1200 
atcccattag acgagcgtga aaagcttgag aaacaagtat tccgctcaag cctagatgaa 1260 
atatgggcag gtacagtggc gcactttaac gagcgcgacc ctaagcaaat cgaacgcgca 1320 
gagggtaacc ctaagcgtaa aatggcattg attttccgtt ggtacttagg tctttctagt 1380 
cgctggtcaa actcaggcga agtgggtcgt gaaatggatt atcaaatttg ggctggccct 1440 
gctctcggtg catttaacca atgggcaaaa ggcagttact tagataacta tcaagaccga 1500 
aatgccgtcg atttggcaaa gcacttaatg tacggcgcgg cttacttaaa tcgtattaac 1560 
tcgctaacgg ctcaaggcgt taaagtgcca gcacagttac ttcgctggaa gccaaaccaa 1620 

1629 